Table 1: Cleavage of 75 human light chains. 





Enzyme    Recognition*     Nch  Ns   Planned location of site
AfeI      AGCgct           0    0
AflII     Cttaag           0    0    HC FR3
AgeI      Accggt           0    0
AscI      GGcgcgcc         0    0    After LC
BglII     Agatct           0    0
BsiWI     Cgtacg           0    0
BspDI     ATcgat           0    0
BssHII    Gcgcgc           0    0
BstBI     TTcgaa           0    0
DraIII    CACNNNgtg        0    0
EagI      Cggccg           0    0
FseI      GGCCGGcc         0    0
FspI      TGCgca           0    0
HpaI      GTTaac           0    0
MfeI      Caattg           0    0    HC FR1
MluI      Acgcgt           0    0
NcoI      Ccatgg           0    0    Heavy chain signal
NheI      Gctagc           0    0    HC/anchor linker
NotI      GCggccgc         0    0    In linker after HC
NruI      TCGcga           0    0
PacI      TTAATtaa         0    0
PmeI      GTTTaaac         0    0
PmlI      CACgtg           0    0
PvuI      CGATcg           0    0
SacII     CCGCgg           0    0
SalI      Gtcgac           0    0
SfiI      GGCCNNNNnggcc    0    0    Heavy chain signal
SgfI      GCGATcgc         0    0
SnaBI     TACgta           0    0
StuI      AGGcct           0    0
XbaI      Tctaga           0    0    HC FR3
AatII     GACGTc           1    1
AclI      AAcgtt           1    1
AseI      ATtaat           1    1
BsmI      GAATGCN          1    1
BspEI     Tccgga           1    1    HC FR1
BstXI     CCANNNNNntgg     1    1    HC FR2
DrdI      GACNNNNnngtc     1    1
HindIII   Aagctt           1    1
PciI      Acatgt           1    1
SapI      gaagagc          1    1
ScaI      AGTact           1    1
SexAI     Accwggt          1    1
SpeI      Actagt           1    1
TliI      Ctcgag           1    1
XhoI      Ctcgag           1    1
BcgI      cgannnnnntgc     2    2
BlpI      GCtnagc          2    2
BssSI     Ctcgtg           2    2
BstAPI    GCANNNNntgc      2    2
EspI      GCtnagc          2    2
KasI      Ggcgcc           2    2
PflMI     CCANNNNntgg      2    2
XmnI      GAANNnnttc       2    2
Express Mail Label
NO.EM25454S92US 






ApaLI     Gtgcac           3    3
NaeI      GCCggc           3    3
NgoMI     Gccggc           3    3
PvuII     CAGctg           3    3
RsrII     CGgwccg          3    3
BsrBI     GAGcgg           4    4
BsrDI     GCAATGNNn        4    4
BstZ17I   GTAtac           4    4
EcoRI     Gaattc           4    4
SphI      GCATGc           4    4
SspI      AATatt           4    4
AccI      GTmkac           5    5
BclI      Tgatca           5    5
BsmBI     Nnnnnngagacg     5    5
BsrGI     Tgtaca           5    5
DraI      TTTaaa           6    6
NdeI      CAtatg           6    6
SwaI      ATTTaaat         6    6
BamHI     Ggatcc           7    7
SacI      GAGCTc           7    7
BciVI     GTATCCNNNNNN     8    8
BsaBI     GATNNnnatc       8    8
NsiI      ATGCAt           8    8
Bsp120I   Gggccc           9    9
ApaI      GGGCCc           9    9
PspOMI    Gggccc           9    9
BspHI     Tcatga           9    11
EcoRV     GATatc           9    9
AhdI      GACNNNnngtc      11   11
BbsI      GAAGAC           11   14
PsiI      TTAtaa           12   12
BsaI      GGTCTCNnnnn      13   15
XmaI      Cccggg           13   14
AvaI      Cycgrg           14   16
BglI      GCCNNNNnggc      14   17
AlwNI     CAGNNNctg        16   16
BspMI     ACCTGC           17   19
XcmI      CCANNNNNnnnntgg  17   26
BstEII    Ggtnacc          19   22
Sse8387I  CCTGCAgg         20   20
AvrII     Cctagg           22   22
HincII    GTYrac           22   22
BsgI      GTGCAG           27   29
MscI      TGGcca           30   34
BseRI     NNnnnnnnnnctcctc 32   35
Bsu36I    CCtnagg          35   37
PstI      CTGCAg           35   40
EciI      nnnnnnnnntccgcc  38   40
PpuMI     RGgwccy          41   50
StyI      Ccwwgg           44   73
EcoO109I  RGgnccy          46   70
Acc65I    Ggtacc           50   51
KpnI      GGTACc           50   51
BpmI      ctccag           53   82
AvaII     Ggwcc            71   124



LC signal seq 



CH1
CH1



HC FR4 



* Cleavage occurs in the top strand after the last upper-case base. For REs
that cut palindromic sequences, the lower strand is cut at the symmetrical
site.
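The footnote's notation can be turned into a small search routine. The following is only a sketch under stated assumptions: it reads the cut position as "after the last upper-case base", expands the IUPAC ambiguity codes used in the Recognition column, and searches the top strand as written (it does not search the reverse strand for non-palindromic sites). The function names and the reading of Nch as "chains cut" and Ns as "total sites" are assumptions made for illustration, not definitions taken from the tables.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "[ag]", "y": "[ct]", "w": "[at]", "s": "[cg]",
    "m": "[ac]", "k": "[gt]", "n": "[acgt]",
}

def parse_site(site):
    """Return (compiled regex, cut offset) for a site in the table's notation."""
    # The cut follows the last upper-case base; 0 if no base is upper-case.
    cut = max((i + 1 for i, ch in enumerate(site) if ch.isupper()), default=0)
    pattern = "".join(IUPAC[ch] for ch in site.lower())
    # A lookahead is used so that overlapping occurrences are all reported.
    return re.compile("(?=(" + pattern + "))"), cut

def top_strand_cuts(site, seq):
    """List top-strand cut positions (0-based index of the base after the cut)."""
    regex, cut = parse_site(site)
    return [m.start() + cut for m in regex.finditer(seq.lower())]

def tally(site, chains):
    """Return (chains with at least one site, total sites) over a set of chains."""
    cuts = [top_strand_cuts(site, chain) for chain in chains]
    n_chains = sum(1 for c in cuts if c)
    n_sites = sum(len(c) for c in cuts)
    return n_chains, n_sites
```

For example, `tally("Ggwcc", chains)` would give an (assumed) Nch/Ns pair for AvaII over a list of chain sequences supplied by the caller.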



Table 2: Cleavage of 79 human heavy chains 




Enzyme    Recognition      Nch  Ns
AfeI      AGCgct           0    0
AflII     Cttaag           0    0
AscI      GGcgcgcc         0    0
BsiWI     Cgtacg           0    0
BspDI     ATcgat           0    0
BssHII    Gcgcgc           0    0
FseI      GGCCGGcc         0    0
HpaI      GTTaac           0    0
NheI      Gctagc           0    0
NotI      GCggccgc         0    0
NruI      TCGcga           0    0
NsiI      ATGCAt           0    0
PacI      TTAATtaa         0    0
PciI      Acatgt           0    0
PmeI      GTTTaaac         0    0
PvuI      CGATcg           0    0
RsrII     CGgwccg          0    0
SapI      gaagagc          0    0
SfiI      GGCCNNNNnggcc    0    0
SgfI      GCGATcgc         0    0
SwaI      ATTTaaat         0    0
AclI      AAcgtt           1    1
AgeI      Accggt           1    1
AseI      ATtaat           1    1
AvrII     Cctagg           1    1
BsmI      GAATGCN          1    1
BsrBI     GAGcgg           1    1
BsrDI     GCAATGNNn        1    1
DraI      TTTaaa           1    1
FspI      TGCgca           1    1
HindIII   Aagctt           1    1
MfeI      Caattg           1    1
NaeI      GCCggc           1    1
NgoMI     Gccggc
SpeI      Actagt
Acc65I    Ggtacc           2    2
BstBI     TTcgaa           2    2
KpnI      GGTACc           2    2
MluI      Acgcgt           2    2
NcoI      Ccatgg           2    2
NdeI      CAtatg           2    2
PmlI      CACgtg           2    2
XcmI      CCANNNNNnnnntgg  2    2
BcgI      cgannnnnntgc     3    3
BclI      Tgatca           3    3
BglI      GCCNNNNnggc      3    3
BsaBI     GATNNnnatc       3    3
BsrGI     Tgtaca           3    3
SnaBI     TACgta           3    3
Sse8387I  CCTGCAgg         3    3



Planned location of site 



HC FR3 
After LC 



HC Linker 

In linker, HC/anchor 



HC signal seq 



In HC signal seq 
HC FR4 
ApaLI     Gtgcac           4    4
BspHI     Tcatga           4    4
BssSI     Ctcgtg           4    4
PsiI      TTAtaa           4    5
SphI      GCATGc           4    4
AhdI      GACNNNnngtc      5    5
BspEI     Tccgga           5    5
MscI      TGGcca           5    5
SacI      GAGCTc           5    5
ScaI      AGTact           5    5
SexAI     Accwggt          5    6
SspI      AATatt           5    5
TliI      Ctcgag           5    5
XhoI      Ctcgag           5    5
BbsI      GAAGAC           7    8
BstAPI    GCANNNNntgc      7    8
BstZ17I   GTAtac           7    7
EcoRV     GATatc           7    7
EcoRI     Gaattc           8    8
BlpI      GCtnagc          9    9
Bsu36I    CCtnagg          9    9
DraIII    CACNNNgtg        9    9
EspI      GCtnagc          9    9
StuI      AGGcct           9    13
XbaI      Tctaga           9    9
Bsp120I   Gggccc           10   11
ApaI      GGGCCc           10   11
PspOMI    Gggccc           10   11
BciVI     GTATCCNNNNNN     11   11
SalI      Gtcgac           11   12
DrdI      GACNNNNnngtc     12   12
KasI      Ggcgcc           12   12
XmaI      Cccggg           12   14
BglII     Agatct           14   14
HincII    GTYrac           16   18
BamHI     Ggatcc           17   17
PflMI     CCANNNNntgg      17   18
BsmBI     Nnnnnngagacg     18   21
BstXI     CCANNNNNntgg     18   19
XmnI      GAANNnnttc       18   18
SacII     CCGCgg           19   19
PstI      CTGCAg           20   24
PvuII     CAGctg           20   22
AvaI      Cycgrg           21   24
EagI      Cggccg           21   22
AatII     GACGTc           22   22
BspMI     ACCTGC           27   33
AccI      GTmkac           30   43
StyI      Ccwwgg           36   49
AlwNI     CAGNNNctg        38   44
BsaI      GGTCTCNnnnn      38   44
PpuMI     RGgwccy          43   46
BsgI      GTGCAG           44   54
BseRI     NNnnnnnnnnctcctc 48   60
EciI      nnnnnnnnntccgcc  52   57
BstEII    Ggtnacc          54   61
EcoO109I  RGgnccy          54   86



LC Signal /FR1 



HC FR1 



HC FR3 
BpmI      ctccag           60   121
AvaII     Ggwcc            71   140



Table 5 (amended): Use of FokI as "Universal Restriction Enzyme"

FokI - for dsDNA, | represents sites of cleavage

                              sites of cleavage
5'-cacGGATGtg-nnnnnnn|nnnnnnn-3' (SEQ ID NO: 15)
3'-gtgCCTACac-nnnnnnnnnnn|nnn-5' (SEQ ID NO: 16)
   RECOGNITion of FokI
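The offsets in the FokI diagram can be written out as arithmetic. This is a minimal sketch, assuming the usual description of FokI as cutting 9 bases past GGATG on the top strand and 13 bases past it on the bottom strand, which agrees with the diagram above; the function name is hypothetical, and only sites in the top-strand orientation are searched.

```python
def foki_cuts(top_strand):
    """Return (top_cut, bottom_cut) index pairs for each GGATG on the top strand.

    Both indices are 0-based positions in top-strand coordinates and name the
    base immediately after the cut.
    """
    seq = top_strand.lower()
    site = "ggatg"
    cuts = []
    start = seq.find(site)
    while start != -1:
        end = start + len(site)
        top = end + 9       # top strand: cut 9 bases past the site
        bottom = end + 13   # bottom strand: cut 13 bases past the site
        if bottom <= len(seq):  # skip sites too close to the 3' end
            cuts.append((top, bottom))
        start = seq.find(site, start + 1)
    return cuts

# The sequence drawn in the table: site, 2-base spacer, then 14 unspecified bases.
example = "cacGGATGtg" + "n" * 14
print(foki_cuts(example))
```

The difference between the two indices is the 4-base 5' overhang that the adapters in Cases I to IV are designed around.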



Case I 



5 1 - . . . gtg | tatt-actgtgc . . Substrate .-3' (SEQ ID NO: 17) 

3 1 -cac- ataa I tgacaca n 

qtGTAGGcac\ 
5'- caCATCCgtg/ (SEQ ID NO: 18) 



Case II 



5 . . gtgtatt | agac-tgc. .Substrate ~3'(SEQ ID N0:19) 

r -cacataa -tctg 1 acg-5 1 
/gtgCCTACac 

\ cacGGATGtg- 3 ' (SEQ ID NO:20) 

Case III (Case I rotated 180 degrees) 

/gtgCCTACac-5 ' 
\ ca cGGAT Gtg—, 

gtqtctt I acaa-tcc-3 * Adapter (SEQ ID NO : 2 1 ) 
3'-. . .cacagaa^tgtc|agg. .substrate. . . .-5* (SEQ ID NO:22) 

Case IV (Case II rotated 180 degrees) 



3'- gtGTAGGcac\ (SEQ ID NO: 23) 
I— caCATCCgtg/ 
S'-qaqj tctc-actaaoc 
Substrate 3'-. . . ctc-agag | tgactcg . . .-5' (SEQ ID NO l 24) 

Improved Fokl adapters 

Fokl - for dsDNA, | represents sites of cleavage 
Case I 

Stem 11, loop 5, stem 11, recognition 17 

5'-. . . catgtgl tatt-actgtgc. .Substrate. . . .-3' 
3 1 -qtacac- ataa I taacacq n r T— , 

qtGTAGGcacG T 
5 1 - caCATCCgtgc C 

Case II 

Stem 10, loop 5, stem 10, recognition 18 

5 ' - . . . gtgtatt I agac-tgctgcc . . Substrate .... -3 1 
r T i " t — cacataa -tctg I acgacgg-5 ' 

p T gtgCCTACac 

J 



C cacGGATGtg-3 ' 

Case III (Case I rotated 180 degrees) 
Stem 11, loop 5, stem 11, recognition 20 

r T n 

T TgtgCCTACac-5 1 
G AcacGGATGtg— ! 

LttJ at^tctt lacag-tccattctg-3 1 Adapter 

'~Z 3 1 - . . . cacagaa-tgtc | aggtaagac . . substrate .... -5 1 

Case IV (Case II rotated 180 degrees) 
\.j Stem 11, loop 4, stem 11, recognition 17 

C3 

Ljl 3 1 - gtGTAGGcacc T 

i— caCATCCqtag T 
5 1 -atcgag 1 tctc-actgaac LtJ 
Substrate 3'-. . . tagctc-agag | tgactcg . . .-5* 



C3 



BseRI

                              | sites of cleavage
5'-cacGAGGAGnnnnnnnnnn|nnnnn-3'
3'-gtgctcctcnnnnnnnn|nnnnnnn-5'
   RECOGNITion of BseRI

Stem 11, loop 5, stem 11, recognition 19 

3 ? - gaacat | cg-ttaagccagta 5 r 

r T - T l cttgta-gc | aattcggtcat-3 ' 

C GCTGAGGAGTC-- J 

T cgactcctcag-5 1 An adapter for BseRI to cleave the substrate above. 
L T 1 



id 

u 



Yii 

C3 





Table 8: Matches to URE FR3 adapters in 79 human HC.






ri * bib U UL 


Heavy-chains 


genes sampled 








/ir uuo jdo 








ii5Az3o b / b 


JibU y Z 4jz 


HSZ93860 








AFlU33b / 




HSAZ 35b / D 


noUi? 4 1Z 


HSZ93863 


5 


/\r iU JUZ o 




AF1 l)33bo 




HSAz3d674 


u on n /i / 1 c 

HSUy 4 4 lo 


MCOMFRAA 




=i -fi n^n 
ai jl ujuoj 




AF103369 




HSA235673 


r_i o n O A /i i £T 


MCOMFRVA 




Ar IUjU bl 




AF1 0337 0 




HSA24055 9 


HSU944 17 


S82745 




/\r x u j>u / z 




af 10337 1 




HSCB201 


1 1 c* 1 n n A /it o 

HSU y 4 4 lb 


S82764 




_ ri n t> ri *7 q 

ariUJU /b 




AF103372 




HSIGGVHC 


HSU963o 9 


S83240 


10 


AF1030 99 




AF158381 




HSU44791 


HSU96391 


SABVH369 




AF103102 




E05213 




HSU44793 


HSU96392 


SADEIGVH 




AF1031G3 




E05886 




HSU82771 


HSU96395 


SAH2IGVH 




AF103174 




E05887 




HSU82949 


HSZ93849 


SDA3IGVH 


O 


AF103186 




HSA235661 




HSU82950 


HSZ93850 


SIGVHTTD 


^5 

18 


afl03187 




HSA235664 




HSU82952 


HSZ93851 


SUK4IGVH 




AF103195 




HSA235660 




HSU82961 


HSZ93853 




^.j 

S ; 2 


afl03277 




HSA235659 




HSU86522 


HSZ93855 




% X? 


afl03286 




HSA235678 




HSU86523 


HSZ93857 




h '• 1 

CI 


AF103309 




HSA235677 










20 

IS Ef 


Table 8B: Testing all distinct GLGs from bases 89.1 to 93.2 of the heavy variable domain

                 Number of mismatches
Id   Nb     0    1    2    3    4          Sequence          SEQ ID NO:
1    38    15   11   10    0    2    Seq1  gtgtattactgtgc    25
2    19     7    6    4    2    0    Seq2  gtAtattactgtgc    26
3     1     0    0    1    0    0    Seq3  gtgtattactgtAA    27
4     7     1    5    1    0    0    Seq4  gtgtattactgtAc    28
5     0     0    0    0    0    0    Seq5  Ttgtattactgtgc    29
6     0     0    0    0    0    0    Seq6  TtgtatCactgtgc    30
7     3     1    0    1    1    0    Seq7  ACAtattactgtgc    31
8     2     0    2    0    0    0    Seq8  ACgtattactgtgc    32
9     9     2    2    4    1    0    Seq9  ATgtattactgtgc    33
Group      26   26   21    4    2
Cumulative 26   52   73   77   79



Table 8C: Most important URE recognition seqs in FR3 Heavy

1 VHSzy1 GTGtattactgtgc (ON_SHC103) (SEQ ID NO: 25)
2 VHSzy2 GTAtattactgtgc (ON_SHC323) (SEQ ID NO: 26)
3 VHSzy4 GTGtattactgtac (ON_SHC349) (SEQ ID NO: 28)
4 VHSzy9 ATGtattactgtgc (ON_SHC5a) (SEQ ID NO: 33)

Table 8D: Testing 79 human HC V genes with four probes

Number of sequences 79
Number of bases 29143

                  Number of mismatches
Id   Best    0    1    2    3    4    5
1    39     15   11   10    1    2    0    Seq1  gtgtattactgtgc  (SEQ ID NO: 25)
2    22      7    6    5    3    0    1    Seq2  gtAtattactgtgc  (SEQ ID NO: 26)
3     7      1    5    1    0    0    0    Seq4  gtgtattactgtAc  (SEQ ID NO: 28)
4    11      2    4    4    1    0    0    Seq9  ATgtattactgtgc  (SEQ ID NO: 33)
Group       25   26   20    5    2
Cumulative  25   51   71   76   78

$ One sequence has five mismatches with sequences 2, 4, and 9; it is scored as best for 2.

Id is the number of the adapter.

Best is the number of sequences for which the identified adapter was the best available.

The rest of the table shows how well the sequences match the adapters. For example, there are 11
sequences that match VHSzy1 (Id=1) with 2 mismatches and are worse for all other adapters. In
this sample, 90% come within 2 bases of one of the four adapters.
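The scoring described in the notes to Table 8D can be sketched as follows. This is an illustration under assumptions, not the procedure actually used to build the table: each target is compared with every adapter by counting mismatched positions, it is assigned to the adapter with the fewest mismatches, and ties go to the adapter listed first (which is consistent with the "$" note). The function names and the example targets are hypothetical; the four adapter sequences are those of Table 8C.

```python
from collections import Counter

def mismatches(a, b):
    """Count positions at which two equal-length sequences differ (case-insensitive)."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a.lower(), b.lower()))

def score_adapters(targets, adapters):
    """Assign each target to its best adapter.

    Returns {adapter_id: Counter({mismatch_count: number_of_targets})}.
    """
    table = {ident: Counter() for ident in adapters}
    for target in targets:
        # min() keeps the first minimum, so ties go to the adapter listed first.
        best = min(adapters, key=lambda ident: mismatches(target, adapters[ident]))
        table[best][mismatches(target, adapters[best])] += 1
    return table

# Adapter recognition sequences from Table 8C, keyed by the Id used in Table 8D.
ADAPTERS = {
    1: "gtgtattactgtgc",
    2: "gtatattactgtgc",
    3: "gtgtattactgtac",
    4: "atgtattactgtgc",
}

# Hypothetical targets, for illustration only.
targets = ["gtgtattactgtgc", "gtgtattactgtaa", "ttgtattactgtgc", "atgtattactgtgc"]
result = score_adapters(targets, ADAPTERS)
```

Under this sketch the "Best" column corresponds to `sum(result[i].values())` and each mismatch column to `result[i][k]`.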



Table 130: PCR primers for amplification of human Ab genes

(HuIgMFOR) 5'-tgg aag agg cac gtt ctt ttc ttt-3'

! (HuIgMFOREtop) 5'-aaa gaa aag aac gtg cct ctt cca-3' = reverse complement

(HuCkFOR) 5'-aca ctc tcc cct gtt gaa gct ctt-3'

(HuCL2FOR) 5'-tga aca ttc tgt agg ggc cac tg-3'

(HuCL7FOR) 5'-aga gca ttc tgc agg ggc cac tg-3'
! Kappa

(CKForeAsc) 5'-acc gcc tcc acc ggg cgc gcc tta tta aca ctc tcc cct gtt-
gaa gct ctt-3'

(CL2ForeAsc) 5'-acc gcc tcc acc ggg cgc gcc tta tta tga aca ttc tgt-
agg ggc cac tg-3'

(CL7ForeAsc) 5'-acc gcc tcc acc ggg cgc gcc tta tta aga gca ttc tgc-
agg ggc cac tg-3'
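Several entries in these tables are stated to be the reverse complement of another oligonucleotide. A minimal sketch of that check follows; the helper name is an assumption, and spaces between codons are simply ignored.

```python
# Translation table for complementing DNA while preserving letter case.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string, ignoring whitespace."""
    bases = "".join(seq.split())
    return bases.translate(COMPLEMENT)[::-1]

# The first primer of Table 130 and the sequence listed as its reverse complement.
forward = "tgg aag agg cac gtt ctt ttc ttt"
listed_top = "aaa gaa aag aac gtg cct ctt cca"
print(reverse_complement(forward) == "".join(listed_top.split()))
```

The same helper can be used to compare an extender with the top-strand sequence it is said to complement.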



Table 195: Human GLG FR3 sequences 
! VH1 

! 66 67 68 69 70 71 72 73 74 75 76 77 78 19 80 





ay y 


gtc 




atg acc agg 


gac 


acg 


tec 


ate 


age 


aca 


gee 


tac 


atg 




! 81 


82 


fi 9 A 

o £. a 


ft 9K QO« Q O 
O^D OZC O O 


ft A 




86 


87 


88 


89 


90 


91 


92 




eta rr 






agg ctg aga 


ccu 


gac 


gac 


acg 


gec 


gtg 


tat 


tac 


tgt 




! 93 


94 


95 






















J 


gcg 


aga 


ga 


I X— U<£# 1 






















aga 


gtc 




aut ayy 


gac 


aca 


tec 


gcg 


age 


aca 


gec 


tac 


atg 




y ay 


ctg 


age 


age ctg aga 


tCt 


gaa 


gac 


acg 


get 


gtg 


tat 


tac 


tgt 




gcg 


aga 


ga 
























aga 


gtc 


a cc 


atg acc agg 


aac 


acc 


tec 


ata 


age 


aca 


gee 


tac 


atg 


10 


gag 
gcg 


ctg 
aga 


age 

gg 


age ctg aga 
l-08# 3 


tct 


gag 


gac 


acg 


gec 


gtg 


tat 


tac 


tgt 




aga 


gtc 


a cc 


atg acc aca 


gac 


aca 


tec 


acg 


age 


aca 


gee 


tac 


atg 




gag 


ctg 


agg 


age ctg aga 


tct 


gac 


gac 


acg 


gec 


gtg 


tat 


tac 


tgt 




gcg 


aga 


ga 


1-18# 4 




















1 *s 


aga 


gtc 


acc 


atg acc gag 


gac 


aca 


tct 


aca 


gac 


aca 


gee 


tac 


atg 




g a g 


ctg 


age 


age ctg aga 


tct 


gag 


gac 


acg 


gec 


gtg 


tat 


tac 


tgt 


y 

£3 


gca 


aca 


ga i 


l-24# 5 




















aga 


gtc 


acc 


att acc agg 


gac 


agg 


tct 


atg 


age 


aca 


gec 


tac 


atg 




gag 


ctg 


age 


age ctg aga 


tct 


gag 


gac 


aca 


gee 


atg 


tat 


tac 


tgt 


^0 


gca 


aga 


ta ! 


l-45# 6 




















o. 


aga 


gtc 


acc 


atg acc agg 


gac 


acg 


tec 


acg 


age 


aca 


gtc 


tac 


atg 




gag 


ctg 


age 


age ctg aga 


tct 


gag 


gac 


acg 


gee 


gtg 


tat 


tac 


tgt 


5 


gcg 


aga 


ga ! 


l-46# 7 





















V -3 



clCf SL 


gtc 


acc 


gag 


ctg 


age 


gcg 


gca 


ga 


aga 


gtc 


acg 


gag 


ctg 


age 


gcg 


aga 


ga 


aga 


gtc 


acg 


gag 


ctg 


age 


gcg 


aga 


ga 


aga 


gtc 


acc 


gag 


ctg 


age 


gca 


aca 


ga 



VH2 
agg etc acc 
aca atg acc 
gca cac aga 
agg etc acc 
acc atg acc 
gca egg ata 
agg etc acc 
aca atg acc 
gca egg ata 

VH3 



cga 


ttc 


acc 


caa 


atg 


aac 


gcg 


aga 


ga 


cga 


ttc 


acc 


caa 


atg 


aac 


gca 


aaa 


gat 


cga 


ttc 


acc 


caa 


atg 


aac 


gcg 


aga 


ga 


cga 


ttc 


acc 


caa 


atg 


aac 


gca 


aga 


ga 


aga 


ttc 


acc 


caa 


atg 


aac 


acc 


aca 


ga 


cga 


ttc 


acc 



att acc agg 
age ctg aga 

! l-58# 8 
att acc gcg 
age ctg aga 

! l-69# 9 
att acc gcg 
age ctg aga 

! l-e# 10 
ata acc gcg 
age ctg aga 

1 l-f# 11 

ate acc aag 
aac atg gac 
c! 2-05# 12 
ate tec aag 
aac atg gac 
c! 2-26# 13 
ate tec aag 
aac atg gac 
c! 2-70# 14 

ate tec aga 
age ctg aga 

! 3-07# 15 
ate tec aga 
agt ctg aga 
a! 3-09#16 
ate tec agg 
age ctg aga 

! 3-ll# 17 
ate tec aga 
age ctg aga 

! 3-13# 18 
ate tea aga 
age ctg aaa 

! 3-15# 19 
ate tec aga 



gac atg tec 
tec gag gac 

gac gaa tec 
tct gag gac 

gac aaa tec 
tct gag gac 

gac acg tct 
tct gag gac 

gac acc tec 
cct gtg gac 

gac acc tec 
cct gtg gac 

gac acc tec 
cct gtg gac 

gac aac gec 
gec gag gac 

gac aac gee 
get gag gac 

gac aac gec 
gec gag gac 

gaa aat gee 
gec ggg gac 

gat gat tea 
acc gag gac 

gac aac gee 



aca age aca 
acg gee gtg 

acg age aca 
acg gee gtg 

acg age aca 
acg gee gtg 

aca gac aca 
acg gec gtg 

aaa aac cag 
aca gec aca 

aaa age- cag 
aca gec aca 

aaa aac cag 
aca gec acg 

aag aac tea 
acg get gtg 

aag aac tec 
acg gee ttg 

aag aac tea 
acg gee gtg 

aag aac tec 
acg get gtg 

aaa aac acg 
aca gee gtg 

aag aac tec 



gee tac atg 
tat tac tgt 

gee tac atg 
tat tac tgt 

gec tac atg 
tat tac tgt 

gee tac atg 
tat tac tgt 

gtg gtc ctt 
tat tac tgt 

gtg gtc ctt 
tat tac tgt' 

gtg gtc ctt 
tat tac tgt 

ctg tat ctg 
tat tac tgt 

ctg tat ctg 
tat tac tgt 

ctg tat ctg 
tat tac tgt 

ttg tat ctt 
tat tac tgt 

ctg tat ctg 
tat tac tgt 

ctg tat ctg 





caa 


at g 




gcg 


aga 




cga 


ttc 




caa 


atg 


< 


gcg 


aga 




egg 


ttc 




caa 


atg 




gcg 


aaa 




cga 


ttc 


JO 


caa 


atg 




g<=g 


aaa 




cga 


ttc 




caa 


atg 




gcg 


aga 


15 




ttc 




caa 


atg 


- _ 








gcg 


aaa 




cga 


ttc 










caa 


atg 


~2fJ 


gcg 


aga 




cga 


ttc 


O 






in 


caa 


atg 




gca 


aaa 




cga 


ttc 




caa 


atg 




gcg 


aga 




aga 


ttc 


'f.A 








caa 


atg 




act 


aga 




cga 


ttc 




caa 


atg 




gcg 


aga 




aga 


ttc 




caa 


atg 


35 


gcg 


aga 




aga 


ttc 




caa 


atg 




gcg 


aga 




aga 


ttc 



aac agt ctg aga 
ga ! 3-20# 20 
acc ate tec aga 
aac age ctg aga 
ga ! 3-21# 21 
acc ate tec aga 
aac age ctg aga 
ga ! 3-23# 22 
acc ate tec aga 
aac age ctg aga 
ga » 3-30# 23 
acc ate tec aga 
aac age ctg aga 
ga ! 3303# 24 
acc ate tec aga 
aac age ctg aga 
ga ! 3305# 25 
acc ate tec aga 
aac age ctg aga 
ga ! 3-33# 26 
acc ate tec aga 
aac agt ctg aga 
gat a! 3-43#27 
acc ate tec aga 
aac age ctg aga 
ga ! 3-48# 28 
acc ate tea aga 
aac age ctg aaa 
ga ! 3-49# 29 
acc ate tec aga 
aac age ctg aga 
ga ! 3-53# 30 
acc ate tec aga 
ggc age ctg aga 
ga ! 3-64# 31 
acc ate tec aga 
aac age ctg aga 
ga ! 3-66# 32 
acc ate tea aga 



gee gag gac acg 

gac aac gee aag 
gec gag gac acg 

gac aat tec aag 
gec gag gac acg 

gac aat tec aag 
get gag gac acg 

gac aat tec aag 
get gag gac acg 

gac aat tec aag 
get gag gac acg 

gac aat tec aag 
gec gag gac acg 

gac aac age aaa 
act gag gac acc 

gac aat gec aag 
gac gag gac acg 

gat ggt tec aaa 
acc gag gac aca 

gac aat tec aag 
gec gag gac acg 

gac aat tec aag 
get gag gac atg 

gac aat tee aag 
get gag gac acg 

gat gat tea aag 



gec ttg tat cac tgt 

aac tea ctg tat ctg 
get gtg tat tac tgt 

aac acg ctg tat ctg 
gec gta tat tac tgt 

aac acg ctg tat ctg 
get gtg tat tac tgt 

aac acg ctg tat ctg 
get gtg tat tac tgt 

aac acg ctg tat ctg 
get gtg tat tac tgt 

aac acg ctg tat ctg 
get gtg tat tac tgt 

aac tec ctg tat ctg 
gee ttg tat tac tgt 

aac tea ctg tat ctg 
get gtg tat tac tgt 

age ate gec tat ctg 
gee gtg tat tac tgt 

aac acg ctg tat ctt 
gee gtg tat tac tgt 

aac acg ctg tat ctt 
get gtg tat tac tgt 

aac acg ctg tat ctt 
get gtg tat tac tgt 

aac tea ctg tat ctg 



caa atg aac age ctg aaa acc 
get aga ga ! 3-72# 33 
agg ttc acc ate tec aga gat 
caa atg aac age ctg aaa acc 
act aga ca I 3-73# 34 
cga ttc acc ate tec aga gac 
caa atg aac agt ctg aga gec 
gca aga ga ! 3-74# 35 
aga ttc acc ate tec aga gac 
caa atg aac age ctg aga get 
aag aaa ga ! 3-d# 36 
VH4 

cga gtc acc ata tea gta gac 

aag ctg age tct gtg acc gec 

gcg aga ga ! 4~04# 37 

cga gtc acc atg tea gta gac 

aag ctg age tct gtg acc gee 

gcg aga aa ! 4-28# 38 

cga gtt acc ata tea gta gac 

aag ctg age tct gtg act gec 

gcg aga ga ! 4301# 39 

cga gtc acc ata tea gta gac 

aag ctg age tct gtg acc gee 

gec aga ga ! 4302# 40 

cga gtt acc ata tea gta gac 

aag ctg age tct gtg act gee 

gec aga ga ! 4304# 41 

cga gtt acc ata tea gta gac 

aag ctg age tct gtg act gec 

gcg aga ga ! 4-31# 42 

cga gtc acc ata tea gta gac 

aag ctg age tct gtg ace gec 

gcg aga ga ! 4-34# 43 

cga gtc acc ata tec gta gac 

aag ctg age tct gtg acc gee 

gcg aga ca ! 4-39# 44 

cga gtc acc ata tea gta gac 

aag ctg age tct gtg acc get 

gcg aga ga ! 4-59# 45 



gag gac acg gec gtg tat tac tgt 

gat tea aag aac acg gcg tat ctg 
gag gac acg gec gtg tat tac tgt 

aac gec aag aac acg ctg tat ctg 
gag gac acg get gtg tat tac tgt 

aat tec aag aac acg ctg cat ctt 
gag gac acg get gtg tat tac tgt 

aag tec aag aac cag ttc tec ctg 
gcg gac acg gec gtg tat tac tgt 

acg tec aag aac cag ttc tec ctg 
gtg gac acg gec gtg tat tac tgt 

acg tct aag aac cag ttc tec ctg 
gcg gac acg gec gtg tat tac tgt 

agg tec aag aac cag ttc tec ctg 
gcg gac acg gec gtg tat tac tgt 

acg tec aag aac cag ttc tec ctg 
gca gac acg gec gtg tat tac tgt 

acg tct aag aac cag ttc tec ctg 
gcg gac acg gec gtg tat tac tgt 

acg tec aag aac cag ttc tec ctg 
gcg gac acg get gtg tat tac tgt 

acg tec aag aac cag ttc tec ctg 
gca gac acg get gtg tat tac tgt 

acg tec aag aac cag ttc tec ctg 
gcg gac acg gec gtg tat tac tgt 



cga gtc acc ata tea gta gac 
aag ctg age tct gtg acc get 
gcg aga ga ! 4-61# 46 
cga gtc acc ata tea gta gac 
aag ctg age tct gtg acc gee 
gcg aga ga ! 4-b# 47 
VH5 

cag gtc acc ate tea gec gac 
cag tgg age age ctg aag gec 
gcg aga ca ! 5-51# 48 
cac gtc acc ate tea get gac 
cag tgg age age ctg aag gee 
gcg aga ! 5-a# 49 
VH6 

cga ata acc ate aac cca gac 
cag ctg aac tct gtg act ccc 
gca aga ga ! 6-l# 50 
VH7 

egg ttt gtc ttc tec ttg gac 
cag ate tgc age eta aag get 
gcg aga ga ! 74. 1# 51 



acg tec aag aac cag ttc tec ctg 
9 C 9 gac acg gee gtg tat tac tgt 

acg tec aag aac cag ttc tec ctg 
gca gac acg gec gtg tat tac tgt 

aag tec ate age acc gec ^tac ctg 
teg gac acc gee atg tat tac tgt 

aag tec ate age act gec tac ctg 
teg gac acc gee atg tat tac tgt 

aca tec aag aac cag ttc tec ctg 
gag gac acg get gtg tat tac tgt 

acc tct gtc age acg gca tat ctg 
gag gac act gec gtg tat tac tgt 



J 



Table 250: REdaptors, Extenders, and Bridges used for Cleavage and Capture of
Human Heavy Chains in FR3.

A: HpyCH4V Probes of actual human HC genes

!HpyCH4V in FR3 of human HC, bases 35-56; only those with TGca site
TGca; 10,

RE recognition: tgca of length 4 is expected at 10

1 6-1 agttctccctgcagctgaactc
2 3-11,3-07,3-21,3-72,3-48 cactgtatctgcaaatgaacag
3 3-09,3-43,3-20 ccctgtatctgcaaatgaacag
4 5-51 ccgcctacctgcagtggagcag
5 3-15,3-30,3-30.5,3-30.3,3-74,3-23,3-33 cgctgtatctgcaaatgaacag
6 7-4.1 cggcatatctgcagatctgcag
7 3-73 cggcgtatctgcaaatgaacag
8 5-a ctgcctacctgcagtggagcag
9 3-49 tcgcctatctgcaaatgaacag

B: HpyCH4V REdaptors, Extenders, and Bridges
B.1 REdaptors

! Cutting HC lower strand:
! TmKeller for 100 mM NaCl, zero formamide
! REdaptors for cleavage
                                                  TmW    TmK
(ON_HCFR36-1)    5'-agttctcccTGCAgctgaactc-3'     68.0   64.5
(ON_HCFR36-1A)   5'-ttctcccTGCAgctgaactc-3'       62.0   62.5
(ON_HCFR36-1B)   5'-ttctcccTGCAgctgaac-3'         56.0   59.9
(ON_HCFR33-15)   5'-cgctgtatcTGCAaatgaacag-3'     64.0   60.8
(ON_HCFR33-15A)  5'-ctgtatcTGCAaatgaacag-3'       56.0   56.3
(ON_HCFR33-15B)  5'-ctgtatcTGCAaatgaac-3'         50.0   53.1
(ON_HCFR33-11)   5'-cactgtatcTGCAaatgaacag-3'     62.0   58.9
(ON_HCFR35-51)   5'-ccgcctaccTGCAgtggagcag-3'     74.0   70.1
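The first melting-temperature column (labelled W) is reproduced by the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees C. That identification is an inference from the numbers, not something the table states, and the function name below is hypothetical. The second column (TmKeller) depends on salt and formamide terms and is not modelled here.

```python
def wallace_tm(oligo):
    """Estimate Tm (degrees C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = "".join(oligo.split()).lower()
    at = seq.count("a") + seq.count("t")
    gc = seq.count("g") + seq.count("c")
    if at + gc != len(seq):
        raise ValueError("sequence contains characters other than a, c, g, t")
    return 2 * at + 4 * gc

# Three of the REdaptors listed in B.1.
for name, oligo in [
    ("ON_HCFR36-1", "agttctcccTGCAgctgaactc"),
    ("ON_HCFR36-1A", "ttctcccTGCAgctgaactc"),
    ("ON_HCFR36-1B", "ttctcccTGCAgctgaac"),
]:
    print(name, wallace_tm(oligo))
```

The rule is a rough guide for short oligonucleotides; it ignores sequence context and buffer conditions, which is presumably why a second estimate is tabulated alongside it.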



B.2 Segment of synthetic 3-23 gene into which captured CDR3 is to be cloned

                    XbaI...
D323* cgCttcacTaag tcT aga gac aaC tcT aag aaT acT ctC taC
      scab designed gene 3-23 gene

      HpyCH4V
      ..            AflII...
      Ttg caG atg aac agC TtA aGg . . .



B.3 Extender and Bridges

! Extender (bottom strand)

(ON_HCHpyEx01) 5'-cAAgTAgAgAgTATTcTTAgAgTTgTcTcTAgAcTTAgTgAAgcg-3'
! ON_HCHpyEx01 is the reverse complement of
! 5'-cgCttcacTaag tcT aga gac aaC tcT aag aaT acT ctC taC Ttg-3'

! Bridges (top strand, 9-base overlap):

! 6-1
(ON_HCHpyBr016-1) 5'-cgCttcacTaag tcT aga gac aaC tcT aag-
    aaT acT ctC taC Ttg CAgctgaac-3' {3'-term C is blocked}

! 3-15 et al. + 3-11
(ON_HCHpyBr023-15) 5'-cgCttcacTaag tcT aga gac aaC tcT aag-
    aaT acT ctC taC Ttg CAaatgaac-3' {3'-term C is blocked}

! 5-51
(ON_HCHpyBr045-51) 5'-cgCttcacTaag tcT aga gac aaC tcT aag-
    aaT acT ctC taC Ttg CAgtggagc-3' {3'-term C is blocked}

! PCR primer (top strand)
(ON_HCHpyPCR) 5'-cgCttcacTaag tcT aga gac-3'

C: BlpI Probes from human HC GLGs

1  1-58,1-03,1-08,1-69,1-24,1-45,1-46,1-f,1-e acatggaGCTGAGCagcctgag
2  1-02 acatggaGCTGAGCaggctgag
3  1-18 acatggagctgaggagcctgag
4  5-51,5-a acctgcagtggagcagcctgaa
5  3-15,3-73,3-49,3-72 atctgcaaatgaacagcctgaa
6  3303,3-33,3-07,3-11,3-30,3-21,3-23,3305,3-48 atctgcaaatgaacagcctgag
7  3-20,3-74,3-09,3-43 atctgcaaatgaacagtctgag
8  74.1 atctgcagatctgcagcctaaa
9  3-66,3-13,3-53,3-d atcttcaaatgaacagcctgag
10 3-64 atcttcaaatgggcagcctgag
11 4301,4-28,4302,4-04,4304,4-31,4-34,4-39,4-59,4-61,4-b ccctgaaGCTGAGCtctgtgac
12 6-1 ccctgcagctgaactctgtgac
13 2-70,2-05 tccttacaatgaccaacatgga
14 2-26 tccttaccatgaccaacatgga

D: BlpI REdaptors, Extenders, and Bridges
D.1 REdaptors
                                                        TmW   TmK
(BlpF3HC1-58) 5'-ac atg gaG CTG AGC agc ctg ag-3'       70    66.4
(BlpF3HC6-1)  5'-cc ctg aag ctg agc tct gtg ac-3'       70    66.4
! BlpF3HC6-1 matches 4-30.1, not 6-1.

D.2 Segment of synthetic 3-23 gene into which captured CDR3 is to be cloned

                    XbaI...                                      BlpI
D323* cgCttcacTaag TCT AGA gac aaC tcT aag aaT acT ctC taC Ttg caG atg aac

          AflII...
      agC TTA AGG



D.3 Extender and Bridges

! Bridges
(BlpF3Br1) 5'-cgCttcacTcag tcT aga gaT aaC AGT aaA aaT acT TtG-
    taC Ttg caG Ctg a|GC agc ctg-3'
(BlpF3Br2) 5'-cgCttcacTcag tcT aga gaT aaC AGT aaA aaT acT TtG-
    taC Ttg caG Ctg a|gc tct gtg-3'
!                    | lower strand is cut here

! Extender
(BlpF3Ext) 5'-
    TcAgcTgcAAgTAcAAAgTATTTTTAcTgTTATcTcTAgAcTgAgTgAAgcg-3'
! BlpF3Ext is the reverse complement of:
! 5'-cgCttcacTcag tcT aga gaT aaC AGT aaA aaT acT TtG taC Ttg caG Ctg a-3'

(BlpF3PCR) 5'-cgCttcacTcag tcT aga gaT aaC-3'
E: HpyCH4III Distinct GLG sequences surrounding site, bases 77-98

1  102#1,118#4,146#7,169#9,1e#10,311#17,353#30,404#37,4301 ccgtgtattactgtgcgagaga
2  103#2,307#15,321#21,3303#24,333#26,348#28,364#31,366#32 ctgtgtattactgtgcgagaga
3  108#3 ccgtgtattactgtgcgagagg
4  124#5,1f#11 ccgtgtattactgtgcaacaga
5  145#6 ccatgtattactgtgcaagata
6  158#8 ccgtgtattactgtgcggcaga
7  205#12 ccacatattactgtgcacacag
8  226#13 ccacatattactgtgcacggat
9  270#14 ccacgtattactgtgcacggat
10 309#16,343#27 ccttgtattactgtgcaaaaga
11 313#18,374#35,61#50 ctgtgtattactgtgcaagaga
12 315#19 ccgtgtattactgtaccacaga
13 320#20 ccttgtatcactgtgcgagaga
14 323#22 ccgtatattactgtgcgaaaga
15 330#23,3305#25 ctgtgtattactgtgcgaaaga
16 349#29 ccgtgtattactgtactagaga
17 372#33 ccgtgtattactgtgctagaga
18 373#34 ccgtgtattactgtactagaca
19 3d#36 ctgtgtattactgtaagaaaga
20 428#38 ccgtgtattactgtgcgagaaa
21 4302#40,4304#41 ccgtgtattactgtgccagaga
22 439#44 ctgtgtattactgtgcgagaca
23 551#48 ccatgtattactgtgcgagaca
24 5a#49 ccatgtattactgtgcgaga

F: HpyCH4III REdaptors, Extenders, and Bridges
F.1 REdaptors

! ONs for cleavage of HC (lower) in FR3 (bases 77-97)
! For cleavage with HpyCH4III, Bst4CI, or TaaI
! cleavage is in lower chain before base 88.

                        77 788 888 888 889 999 999 9
                        78 901 234 567 890 123 456 7      TmW   TmK
(H43.77.97.1-02#1)   5'-cc gtg tat tAC TGT gcg aga g-3'   64    62.6
(H43.77.97.1-03#2)   5'-ct gtg tat tAC TGT gcg aga g-3'   62    60.6
(H43.77.97.108#3)    5'-cc gtg tat tAC TGT gcg aga g-3'   64    62.6
(H43.77.97.323#22)   5'-cc gta tat tac tgt gcg aaa g-3'   60    58.7
(H43.77.97.330#23)   5'-ct gtg tat tac tgt gcg aaa g-3'   60    58.7
(H43.77.97.439#44)   5'-ct gtg tat tac tgt gcg aga c-3'   62    60.6
(H43.77.97.551#48)   5'-cc atg tat tac tgt gcg aga c-3'   62    60.6
(H43.77.97.5a#49)    5'-cc atg tat tAC TGT gcg aga-3'     58    58.3



F.2 Extender and Bridges

! XbaI and AflII sites in bridges are bunged
(H43.XABr1) 5'-ggtgtagtga-
    |TCT|AGt|gac|aac|tct|aag|aat|act|ctc|tac|ttg|cag|atg|-
    |aac|agC|TTt|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat tgt gcg aga-3'
(H43.XABr2) 5'-ggtgtagtga-
    |TCT|AGt|gac|aac|tct|aag|aat|act|ctc|tac|ttg|cag|atg|-
    |aac|agC|TTt|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat tgt gcg aaa-3'
(H43.XAExt) 5'-ATAgTAgAcT gcAgTgTccT cAgcccTTAA gcTgTTcATc TgcAAgTAgA-
    gAgTATTcTT AgAgTTgTcT cTAgATcAcT AcAcc-3'
! H43.XAExt is the reverse complement of
! 5'-ggtgtagtga-
!   |TCT|AGA|gac|aac|tct|aag|aat|act|ctc|tac|ttg|cag|atg|-
!   |aac|agC|TTA|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat|-3'
(H43.XAPCR) 5'-ggtgtagtga|TCT|AGA|gac|aac-3'

! XbaI and AflII sites in bridges are bunged
(H43.ABr1) 5'-ggtgtagtga-
    |aac|agC|TTt|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat tgt gcg aga-3'
(H43.ABr2) 5'-ggtgtagtga-
    |aac|agC|TTt|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat tgt gcg aaa-3'
(H43.AExt) 5'-ATAgTAgAcTgcAgTgTccTcAgcccTTAAgcTgTTTcAcTAcAcc-3'
! (H43.AExt) is the reverse complement of 5'-ggtgtagtga-
!   |aac|agC|TTA|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat-3'
(H43.APCR) 5'-ggtgtagtga|aac|agC|TTA|AGg|gct|g-3'

Sites to be varied --->         ***     ***     ***
   FR1------------------------>|...CDR1............|---FR2---
     46  47  48  49  50  51  52  53  54  55  56  57  58  59  60
     A   S   G   F   T   F   S   S   Y   A   M   S   W   V   R
   |gct|TCC|GGA|ttc|act|ttc|tct|tCG|TAC|Gct|atg|tct|tgg|gtt|cgC|  143
   |cga|agg|cct|aag|tga|aag|aga|agc|atg|cga|tac|aga|acc|caa|gcg|
       | BspEI |               |   BsiWI   |               |BstXI.

Sites to be varied --->                         ***     *** ***
   FR2------------------------------------>|...CDR2............
     61  62  63  64  65  66  67  68  69  70  71  72  73  74  75
     Q   A   P   G   K   G   L   E   W   V   S   A   I   S   G
   |CAa|gct|ccT|GGt|aaa|ggt|ttg|gag|tgg|gtt|tct|gct|atc|tct|ggt|  188
   |gtt|cga|gga|cca|ttt|cca|aac|ctc|acc|caa|aga|cga|tag|aga|cca|
   ...BstXI       |

                ***     ***
   ...CDR2.........................................|---FR3---
     76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
     S   G   G   S   T   Y   Y   A   D   S   V   K   G   R   F
   |tct|ggt|ggc|agt|act|tac|tat|gct|gac|tcc|gtt|aaa|ggt|cgc|ttc|  233
   |aga|cca|ccg|tca|tga|atg|ata|cga|ctg|agg|caa|ttt|cca|gcg|aag|

   ---FR3---
     91  92  93  94  95  96  97  98  99 100 101 102 103 104 105
     T   I   S   R   D   N   S   K   N   T   L   Y   L   Q   M
   |act|atc|TCT|AGA|gac|aac|tct|aag|aat|act|ctc|tac|ttg|cag|atg|  278
   |tga|tag|aga|tct|ctg|ttg|aga|ttc|tta|tga|gag|atg|aac|gtc|tac|
           | XbaI  |

   ---FR3---------------------------------------------------->|
    106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
     N   S   L   R   A   E   D   T   A   V   Y   Y   C   A   K
   |aac|agC|TTA|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat|tgc|gct|aaa|  323
   |ttg|tcg|aat|tcc|cga|ctc|ctg|tga|cgt|cag|atg|ata|acg|cga|ttt|
       |   AflII   |           |   PstI    |

   ...CDR3.....................................|---FR4---
    121 122 123 124 125 126 127 128 129 130 131 132 133 134 135
     D   Y   E   G   T   G   Y   A   F   D   I   W   G   Q   G
   |gac|tat|gaa|ggt|act|ggt|tat|gct|ttc|gaC|ATA|TGg|ggt|caa|ggt|  368
   |ctg|ata|ctt|cca|tga|cca|ata|cga|aag|ctg|tat|acc|cca|gtt|cca|
                                       |   NdeI    |

   ---FR4--------------------->|
    136 137 138 139 140 141 142
     T   M   V   T   V   S   S
   |act|atG|GTC|ACC|gtc|tct|agt-  389
   |tga|tac|cag|tgg|cag|aga|tca-
       |  BstEII   |

    143 144 145 146 147 148 149 150 151 152
     A   S   T   K   G   P   S   V   F   P
    gcc tcc acc aaG GGC CCa tcg GTC TTC ccc-3'  419
    cgg agg tgg ttc ccg ggt agc cag aag ggg-5'
                Bsp120I.        BbsI...(2/2)
                ApaI....

(SFPRMET) 5'-ctg tct gaa cG GCC cag ccG-3'
(TOPFR1A) 5'-ctg tct gaa cG GCC cag ccG GCC atg gcc-
              gaa|gtt|CAA|TTG|tta|gag|tct|ggt|-
             |ggc|ggt|ctt|gtt|cag|cct|ggt|ggt|tct|tta-3'
(BOTFR1B) 3'-caa|gtc|gga|cca|cca|aga|aat|gca|gaa|aga|acg|cga|-
             |cga|agg|cct|aag|tga|aag-5'  ! bottom strand



(BOTFR2)  3'-acc|caa|gcg|-
             |gtt|cga|gga|cca|ttt|cca|aac|ctc|acc|caa|aga|-5'  ! bottom strand
(BOTFR3)  3'-  a|cga|ctg|agg|caa|ttt|cca|gcg|aag|-
             |tga|tag|aga|tct|ctg|ttg|aga|ttc|tta|tga|gag|atg|aac|gtc|tac|-
             |ttg|tcg|aat|tcc|cga|ctc|ctg|tga-5'
(F06)     5'-gC|TTA|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat|tgc|gct|aaa|-
             |gac|tat|gaa|ggt|act|ggt|tat|gct|ttc|gaC|ATA|TGg|ggt|c-3'
(BOTFR4)  3'-cga|aag|ctg|tat|acc|cca|gtt|cca|-
             |tga|tac|cag|tgg|cag|aga|tca-
              cgg agg tgg ttc ccg ggt agc cag aag ggg-5'  ! bottom strand
(BOTPRCPRIM) 3'-gg ttc ccg ggt agc cag aag ggg-5'
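The residue letters printed above each codon row of the annotated V3-23 listing can be cross-checked by translating the codons. The following is a minimal Python sketch under the assumption of the standard genetic code; the names `CODON_TABLE` and `translate` are illustrative and not taken from the original.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string, ignoring case and layout characters such as '|'."""
    seq = "".join(ch for ch in dna.lower() if ch in "acgt")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# FR3 codons 91-105 and 106-120 as read from the annotated listing.
fr3_a = "|act|atc|TCT|AGA|gac|aac|tct|aag|aat|act|ctc|tac|ttg|cag|atg|"
fr3_b = "|aac|agC|TTA|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat|tgc|gct|aaa|"

print(translate(fr3_a))
print(translate(fr3_b))
```

Because case is ignored, the same function works on the bridge and extender oligonucleotides, where capitals mark bases changed to create or destroy restriction sites.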

! CDR1 diversity

(ON-vgC1) 5'-|gct|TCC|GGA|ttc|act|ttc|tct|<1>|TAC|<1>|atg|<1>|-
!                                       CDR1                    6859
              |tgg|gtt|cgC|CAa|gct|ccT|GG-3'

! <1> stands for an equimolar mix of {ADEFGHIKLMNPQRSTVWY}; no C
!     (this is not a sequence)

! CDR2 diversity

(ON-vgC2) 5'-ggt|ttg|gag|tgg|gtt|tct|<2>|atc|<2>|<3>|-
!                                    CDR2
              |tct|ggt|ggc|<1>|act|<1>|tat|gct|gac|tcc|gtt|aaa|gg-3'
!              CDR2

! <1> is an equimolar mixture of {ADEFGHIKLMNPQRSTVWY}; no C
! <2> is an equimolar mixture of {YRWVGS}; no ACDEFHIKLMNPQT
! <3> is an equimolar mixture of {PS}; no ACDEFGHIKLMNQRTVWY
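The theoretical amino-acid diversity encoded by each oligonucleotide is the product of the mixture sizes at its varied positions. The following is a minimal Python sketch of that arithmetic; the dictionary `MIX` and the function `library_size` are illustrative names, and the lists of varied positions are read from ON-vgC1 and ON-vgC2 as printed above.

```python
from math import prod

# Amino-acid mixtures as defined in the listing.
MIX = {
    "<1>": "ADEFGHIKLMNPQRSTVWY",  # 19 residues, no C
    "<2>": "YRWVGS",               # 6 residues
    "<3>": "PS",                   # 2 residues
}


def library_size(positions):
    """Number of distinct amino-acid sequences for a list of varied positions."""
    return prod(len(MIX[p]) for p in positions)


cdr1 = library_size(["<1>", "<1>", "<1>"])                # ON-vgC1
cdr2 = library_size(["<2>", "<2>", "<3>", "<1>", "<1>"])  # ON-vgC2
combined = cdr1 * cdr2

print("CDR1:", cdr1)
print("CDR2:", cdr2)
print("CDR1 x CDR2:", combined)
```

These are upper bounds on sequence diversity; the number of clones actually obtained depends on transformation efficiency, which the listing does not specify.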



Table 800 (new) 



The following list of enzymes was taken from
http://rebase.neb.com/cgi-bin/asymmlist.

I have removed the enzymes that a) cut within the recognition
sequence, b) cut on both sides of the recognition sequence, or
c) have fewer than 2 bases between the recognition sequence and
the closest cut site.
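The three exclusion criteria can be expressed as a simple filter. The following is a minimal Python sketch, not the procedure actually used to build the table; the `Enzyme` record, its field names, and the example cut offsets are assumptions made for illustration and should be checked against REBASE before use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Enzyme:
    name: str
    site: str
    top_cut: int      # bases between end of recognition sequence and upper-strand cut
    bottom_cut: int   # same for the lower strand; a negative value lies inside the site
    both_sides: bool = False  # True if the enzyme also cuts upstream of the site


def keep(enzyme: Enzyme, min_gap: int = 2) -> bool:
    """Apply criteria (a)-(c): reject two-sided cutters and cuts closer than min_gap."""
    if enzyme.both_sides:                      # criterion (b)
        return False
    closest = min(enzyme.top_cut, enzyme.bottom_cut)
    return closest >= min_gap                  # criteria (a) and (c)


# Illustrative entries only; offsets are quoted from memory of REBASE.
candidates = [
    Enzyme("FokI", "GGATG", 9, 13),
    Enzyme("BbvI", "GCAGC", 8, 12),
    Enzyme("BsmI", "GAATGC", 1, -1),
    Enzyme("BcgI", "CGANNNNNNTGC", 12, 10, both_sides=True),
]

kept = [e.name for e in candidates if keep(e)]
print(kept)
```

With these example entries, BsmI fails because its lower-strand cut falls inside the recognition sequence, and BcgI fails because it cuts on both sides.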



REBASE Enzymes 
04/13/2001 



Type II 
sequence 

Enzymes 

Aarl 

Acelll 

Bbr7I 

Bbvl 

BbvII 

Bce83I 

BceAI 

Beef I 

BciVI 

Bfil 

BinI 

BscAI 

BseRI 

BsmFI 

BspMI 

Ecil 

Eco57I 

Faul 

Fokl 

Gsul 

Hgal 

HphI 

MboII 

Mlyl 

Mmel 

Mnll 

Plel 

RleAI 

SfaNI 

SspDSl 

Sthl32I 

StsI 

Taqll 

Tthllll 

UbaPI 



Bful 
Bmrl 



BspLUllIII 
Acc36I 



Suppliers 

y 



restriction enzymes with asymmetric recognition 
s : 

Recognition Sequence Isoschizomers 
CAC C T GCNNNN A NNNN__ 
CAGC T C NNNNNNN A NNNN_ 
G AAG AC NNNNNNN A NNNN_ 
GC AGCNNNNNNNN A NNNN__ 
GAAGACNN" NNNN_ 
C T TGAGNNNNNNNNIW^ - 
ACGGCNNNNNNNNNNNN^NN 



ACGGCNNNNNNNNNNNN"N_ 
G TAT C CNNNNN_N A 
AC T G G GNNNN_N A 
GGATCNNNN A N_ 
GCATCNNNN A NN_ 
GAG GAGNNNNNNNN__NN A 
G G GACNNNNNNNNNN A NNNN_ 
ACCTGCNNNN A NNNN_ 
GGCGGAISrNNNNNNNN_NN /v 

C T GAAGNNNNTSFNNNNNNNN A BspKT5I 
CCCGCNNNN A NN_ BstFZ438l 
GGATGNNNNNNNNN^ NNNN_ Bst P24 1 8 1 
C T GGAGNNNNNNNNNNNNNN_NN A 
G AC G C NNNNN A NNNNN_ 
G G T G ANNNNNNN_N A AsuHPI 
GAAG ANNNNNNN_N " 
GAGTCNNNNN^ SchI 
T C C RAC NNNNNNNNNNNNRNKNNN_NN " 
CCTCNNNNNNJST 

GAGTCNNNN A N_ PpsI 
C C CAC ANNNNNNNNN_NNN A 
G CATCNNNNN A NNNN_ BspS T5 1 

GGTGANNNNNNNN* 
CCCGNNNN A NNNN_ 
GGATGNNNNNNNNNN A NNNN_ - 

G AC C G ANNNNNNNNN_NN A , CAC C C ANNNN NNNNN_NN ' 
I CAARCANNNNNNNNN_NN A - ~ 

CGAACG 



y 
y 



y 
y 
y 
y 
y 
y 
y 
y 
y 
y 
y 
y 

y 
y 



The notation is: ^ means cut the upper strand and _ means cut the
lower strand. If the upper and lower strands are cut at the same
place, then only ^ appears.
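The cut-site notation can be turned into numeric offsets with a few lines of code. The following is a minimal Python sketch; the function name `parse_cut_notation` is illustrative, and it assumes that the recognition sequence itself contains no N, so that the first N, ^ or _ marks its end.

```python
def parse_cut_notation(entry: str):
    """Split an entry such as 'GGATGNNNNNNNNN^NNNN_' into
    (recognition, upper_strand_offset, lower_strand_offset).

    Offsets count the N positions between the recognition sequence and the cut.
    """
    entry = entry.replace(" ", "")
    end = 0
    while end < len(entry) and entry[end] not in "N^_":
        end += 1
    recognition = entry[:end]

    upper = lower = None
    n_seen = 0
    for ch in entry[end:]:
        if ch == "N":
            n_seen += 1
        elif ch == "^":
            upper = n_seen
        elif ch == "_":
            lower = n_seen
        else:
            raise ValueError(f"unexpected character {ch!r} after recognition sequence")
    if upper is None:
        raise ValueError("no upper-strand cut (^) found")
    if lower is None:  # only ^ given: both strands cut at the same place
        lower = upper
    return recognition, upper, lower


# Entries built with explicit N counts to avoid miscounting by eye.
fok_i = parse_cut_notation("GGATG" + "N" * 9 + "^" + "N" * 4 + "_")
bci_vi = parse_cut_notation("GTATCC" + "N" * 5 + "_" + "N" + "^")
mly_i = parse_cut_notation("GAGTC" + "N" * 5 + "^")
print(fok_i, bci_vi, mly_i)
```

The offsets returned here are the quantities tested by criterion c) above: an entry is retained only when the smaller of the two offsets is at least 2.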



Table 120: MALIA3, annotated
! MALIA3 9532 bases
     1 aat gct act act att agt aga att gat gcc acc ttt tca gct cgc gcc
       gene ii continued
    49 cca aat gaa aat ata gct aaa cag gtt att gac cat ttg cga aat gta
    97 tct aat ggt caa act aaa tct act cgt tcg cag aat tgg gaa tca act
   145 gtt aca tgg aat gaa act tcc aga cac cgt act tta gtt gca tat tta
   193 aaa cat gtt gag cta cag cac cag att cag caa tta agc tct aag cca
   241 tcc gca aaa atg acc tct tat caa aag gag caa tta aag gta ctc tct
   289 aat cct gac ctg ttg gag ttt gct tcc ggt ctg gtt cgc ttt gaa gct
   337 cga att aaa acg cga tat ttg aag tct ttc ggg ctt cct ctt aat ctt
   385 ttt gat gca atc cgc ttt gct tct gac tat aat agt cag ggt aaa gac
   433 ctg att ttt gat tta tgg tca ttc tcg ttt tct gaa ctg ttt aaa gca
   481 ttt gag ggg gat tca ATG aat att tat gac gat tcc gca gta ttg gac
                           Start gene x, ii continues
   529 gct atc cag tct aaa cat ttt act att acc ccc tct ggc aaa act tct
   577 ttt gca aaa gcc tct cgc tat ttt ggt ttt tat cgt cgt ctg gta aac
   625 gag ggt tat gat agt gtt gct ctt act atg cct cgt aat tcc ttt tgg
   673 cgt tat gta tct gca tta gtt gaa tgt ggt att cct aaa tct caa ctg
   721 atg aat ctt tct acc tgt aat aat gtt gtt ccg tta gtt cgt ttt att
   769 aac gta gat ttt tct tcc caa cgt cct gac tgg tat aat gag cca gtt
   817 ctt aaa atc gca TAA
                       End X & II
   832 ggtaattca ca




























       M1              E5                  Q10                 T15
   843 ATG att aaa gtt gaa att aaa cca tct caa gcc caa ttt act act cgt
       Start gene V

       S17         S20                 P25                 E30
   891 tct ggt gtt tct cgt cag ggc aag cct tat tca ctg aat gag cag ctt

               V35                 E40                 V45
   939 tgt tac gtt gat ttg ggt aat gaa tat ccg gtt ctt gtc aag att act

           D50                 A55                 L60
   987 ctt gat gaa ggt cag cca gcc tat gcg cct ggt cTG TAC Acc gtt cat
                                                    BsrGI...



       L65                 V70                 S75                 R80
  1035 ctg tcc tct ttc aaa gtt ggt cag ttc ggt tcc ctt atg att gac cgt

                       P85     K87 end of V
  1083 ctg cgc ctc gtt ccg gct aag TAA C

  1108 ATG gag cag gtc gcg gat ttc gac aca att tat cag gcg atg
       Start gene VII

  1150 ata caa atc tcc gtt gta ctt tgt ttc gcg ctt ggt ata atc

       VII and IX overlap.
                               S2 V3  L4  V5                  S10
  1192 gct ggg ggt caa agA TGA gt gtt tta gtg tat tct ttc gcc tct ttc gtt
                           End VII
                         |start IX

       L13     W15                 G20                 T25             E29
  1242 tta ggt tgg tgc ctt cgt agt ggc att acg tat ttt acc cgt tta atg gaa

  1293 act tcc tc
       .... stop of IX, IX and VIII overlap by four bases
  1301 ATG aaa aag tct tta gtc ctc aaa gcc tct gta gcc gtt gct acc ctc
       Start signal sequence of viii.

  1349 gtt ccg atg ctg tct ttc gct gct gag ggt gac gat ccc gca aaa gcg
       mature VIII --->

  1397 gcc ttt aac tcc ctg caa gcc tca gcg acc gaa tat atc ggt tat gcg
  1445 tgg gcg atg gtt gtt gtc att
  1466 gtc ggc gca act atc ggt atc aag ctg ttt aag
  1499 aaa ttc acc tcg aaa gca ! 1515
       -35 . .
  1517 agc tga taaaccgat acaattaaag gctccttttg
       -10
  1552 gagccttttt ttttGGAGAt ttt ! S.D. underlined

       <------ iii signal sequence ------>
             M   K   K   L   L   F   A   I   P   L   V
  1575 caac GTG aaa aaa tta tta ttc gca att cct tta gtt ! 1611





        V   P   F   Y   S   H   S   A   Q
  1612 gtt cct ttc tat tct cac aGT gcA Cag tCT
                                ApaLI...

  1642 GTC GTG ACG CAG CCG CCC TCA GTG TCT GGG GCC CCA GGG CAG -
       AGG GTC ACC ATC TCC TGC ACT GGG AGC AGC TCC AAC ATC GGG GCA
         BstEII...

  1729 GGT TAT GAT GTA CAC TGG TAC CAG CAG CTT CCA GGA ACA GCC CCC AAA
  1777 CTC CTC ATC TAT GGT AAC AGC AAT CGG CCC TCA GGG GTC CCT GAC CGA
  1825 TTC TCT GGC TCC AAG TCT GGC ACC TCA GCC TCC CTG GCC ATC ACT
  1870 GGG CTC CAG GCT GAG GAT GAG GCT GAT TAT
  1900 TAC TGC CAG TCC TAT GAC AGC AGC CTG AGT
  1930 GGC CTT TAT GTC TTC GGA ACT GGG ACC AAG GTC ACC GTC
                                             BstEII...
  1969 CTA GGT CAG CCC AAG GCC AAC CCC ACT GTC ACT
  2002 CTG TTC CCG CCC TCC TCT GAG GAG CTC CAA GCC AAC AAG GCC ACA CTA
  2050 GTG TGT CTG ATC AGT GAC TTC TAC CCG GGA GCT GTG ACA GTG GCC TGG
  2098 AAG GCA GAT AGC AGC CCC GTC AAG GCG GGA GTG GAG ACC ACC ACA CCC
  2146 TCC AAA CAA AGC AAC AAC AAG TAC GCG GCC AGC AGC TAT CTG AGC CTG
  2194 ACG CCT GAG CAG TGG AAG TCC CAC AGA AGC TAC AGC TGC CAG GTC ACG
  2242 CAT GAA GGG AGC ACC GTG GAG AAG ACA GTG GCC CCT ACA GAA TGT TCA



  2290 TAA TAA ACCGCCTCCACCG GGCGCGCC AAT TCTATTTCAA GGAGACAGTC ATA
                             AscI....

       PelB signal ------------------------------------------------>
        M   K   Y   L   L   P   T   A   A   A   G   L   L   L   L
  2343 ATG AAA TAC CTA TTG CCT ACG GCA GCC GCT GGA TTG TTA TTA CTC

        16  17  18  19  20  21  22
        A   A   Q   P   A   M   A
  2388 gcG GCC cag ccG GCC atg gcc
         SfiI.............
                 NgoMI...(1/2)
                        NcoI....

       FR1 (DP47/V3-23) --------------->
         23  24  25  26  27  28  29  30
         E   V   Q   L   L   E   S   G
  2409 |gaa|gtt|CAA|TTG|tta|gag|tct|ggt|
               | MfeI  |

       ---FR1---
         31  32  33  34  35  36  37  38  39  40  41  42  43  44  45
         G   G   L   V   Q   P   G   G   S   L   R   L   S   C   A
  2433 |ggc|ggt|ctt|gtt|cag|cct|ggt|ggt|tct|tta|cgt|ctt|tct|tgc|gct|

       FR1------------------------>|...CDR1............|---FR2---
         46  47  48  49  50  51  52  53  54  55  56  57  58  59  60
         A   S   G   F   T   F   S   S   Y   A   M   S   W   V   R
  2478 |gct|TCC|GGA|ttc|act|ttc|tct|tCG|TAC|Gct|atg|tct|tgg|gtt|cgC|
           | BspEI |               |   BsiWI   |               |BstXI.

       FR2------------------------------------>|...CDR2............
         61  62  63  64  65  66  67  68  69  70  71  72  73  74  75
         Q   A   P   G   K   G   L   E   W   V   S   A   I   S   G
  2523 |CAa|gct|ccT|GGt|aaa|ggt|ttg|gag|tgg|gtt|tct|gct|atc|tct|ggt|
       ...BstXI       |

       ...CDR2.........................................|---FR3---
         76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
         S   G   G   S   T   Y   Y   A   D   S   V   K   G   R   F
  2568 |tct|ggt|ggc|agt|act|tac|tat|gct|gac|tcc|gtt|aaa|ggt|cgc|ttc|

       ---FR3---
         91  92  93  94  95  96  97  98  99 100 101 102 103 104 105
         T   I   S   R   D   N   S   K   N   T   L   Y   L   Q   M
  2613 |act|atc|TCT|AGA|gac|aac|tct|aag|aat|act|ctc|tac|ttg|cag|atg|
               | XbaI  |

       ---FR3---------------------------------------------------->|
        106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
         N   S   L   R   A   E   D   T   A   V   Y   Y   C   A   K
  2658 |aac|agC|TTA|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat|tgc|gct|aaa|
           |   AflII   |           |   PstI    |

       ...CDR3.....................................|---FR4---
        121 122 123 124 125 126 127 128 129 130 131 132 133 134 135
         D   Y   E   G   T   G   Y   A   F   D   I   W   G   Q   G
  2703 |gac|tat|gaa|ggt|act|ggt|tat|gct|ttc|gaC|ATA|TGg|ggt|caa|ggt|
                                           |   NdeI    | (1/4)

       ---FR4--------------------->|
        136 137 138 139 140 141 142
         T   M   V   T   V   S   S
  2748 |act|atG|GTC|ACC|gtc|tct|agt
           |  BstEII   |

From BstEII onwards, pV323 is the same as pCES1, except as noted.

BstEII sites may occur in light chains; BstEII is not likely to be unique
in the final vector.
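Whether a restriction site is unique in a given construct can be checked by counting matches of its recognition sequence. The following is a minimal Python sketch; the function `count_sites` and the `IUPAC` table are illustrative, the BstEII recognition sequence GGTNACC is assumed, and the fragments are short excerpts typed in from this listing rather than the full vector.

```python
import re

# Subset of IUPAC nucleotide codes, expressed as regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "R": "[AG]", "Y": "[CT]", "W": "[AT]",
    "S": "[CG]", "M": "[AC]", "K": "[GT]",
}


def count_sites(seq: str, site: str) -> int:
    """Count occurrences (overlaps included) of a recognition site in one strand."""
    pattern = "".join(IUPAC[base] for base in site.upper())
    cleaned = "".join(ch for ch in seq.upper() if ch in "ACGT")
    # A lookahead lets overlapping matches be counted as well.
    return len(re.findall(f"(?=({pattern}))", cleaned))


BSTEII = "GGTNACC"  # palindromic, so scanning one strand is sufficient

heavy_fr4 = "act atG GTC ACC gtc tct agt"
light_fragments = [
    "CAG AGG GTC ACC ATC TCC",
    "ACC AAG GTC ACC GTC CTA",
]

heavy_hits = count_sites(heavy_fr4, BSTEII)
light_hits = sum(count_sites(f, BSTEII) for f in light_fragments)
print("heavy chain FR4:", heavy_hits)
print("light chain fragments:", light_hits)
```

For a non-palindromic recognition sequence the reverse complement would have to be scanned as well; that case is not handled in this sketch.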
        A   S   T   K   G   P   S   V   F   P
  2769 gcc tcc acc aaG GGC CCa tcg GTC TTC ccc
                    Bsp120I.        BbsI...(2/2)
                    ApaI....

        L   A   P   S   S   K   S   T   S   G   G   T   A   A   L
  2799 ctg gca ccC TCC TCc aag agc acc tct ggg ggc aca gcg gcc ctg
       (Bsu36I...) (knocked out)

       213 214 215 216 217 218 219 220 221 222 223 224 225 226 227
        V   P   S   S   S   L   G   T   Q   T   Y   I   C   N   V
  2979 gtg ccC tCt tct agc tTG Ggc acc cag acc tac atc tgc aac gtg
       (BstXI) N.B. destruction of BstXI & BpmI sites



       228 229 230 231 232 233 234 235 236 237 238 239 240 241 242
        N   H   K   P   S   N   T   K   V   D   K   K   V   E   P
  3024 aat cac aag ccc agc aac acc aag gtg gac aag aaa gtt gag ccc

       243 244 245
        K   S   C   A   A   A   H   H   H   H   H   H   S   A
  3069 aaa tct tgt GCG GCC GCt cat cac cac cat cat cac tct gct
                   NotI......

        E   Q   K   L   I   S   E   E   D   L   N   G   A   A
  3111 gaa caa aaa ctc atc tca gaa gag gat ctg aat ggt gcc gca

        D   I   N   D   D   R   M   A   S   G   A
  3153 GAT ATC aac gat gat cgt atg gct AGC ggc gcc
       EcoRV..                     NheI... KasI...
               rEK cleavage site

       Domain 1
        A   E   T   V   E   S   C   L   A
  3183 gct gaa act gtt gaa agt tgt tta gca

        K   P   H   T   E   N   S   F
  3210 aaa ccc cat aca gaa aat tca ttt

        T   N   V   W   K   D   D   K   T
  3234 aCT AAC GTC TGG AAA GAC GAC AAA ACt

        L   D   R   Y   A   N   Y   E   G   C   L   W   N   A   T   G
  3261 tta gat cgt tac gct aac tat gag ggt tgt ctg tgG AAT GCt aca ggc gtt
!                                                  BsmI

!       V   V   C   T   G   D   E   T   Q   C   Y   G   T   W   V   P   I
  3312 gta gtt tgt act ggt GAC GAA ACT CAG TGT TAC GGT ACA TGG GTT cct att
!
!       G   L   A   I   P   E   N
  3363 ggg ctt gct atc cct gaa aat

! L1 linker
!       E   G   G   G   S   E   G   G   G   S
  3384 gag ggt ggt ggc tct gag ggt ggc ggt tct
!
!       E   G   G   G   S   E   G   G   G   T
  3414 gag ggt ggc ggt tct gag ggt ggc ggt act
!
  3444 aaa cct cct gag tac ggt gat aca cct att ccg ggc tat act tat atc aac
  3495 cct ctc gac ggc act tat ccg cct ggt act gag caa aac ccc gct aat cct
  3546 aat cct tct ctt GAG GAG tct cag cct ctt aat act ttc atg ttt cag aat
                       BseRI
  3597 aat agg ttc cga aat agg cag ggg gca tta act gtt tat acg ggc act
  3645 gtt act caa ggc act gac ccc gtt aaa act tat tac cag tac act cct
  3693 gta tca tca aaa gcc atg tat gac gct tac tgg aac ggt aaa ttc AGA
                                                                   AlwNI
  3741 GAC TGc gct ttc cat tct ggc ttt aat gaa gat cca ttc gtt tgt gaa
       AlwNI
  3789 tat caa ggc caa tcg tct gac ctg cct caa cct cct gtc aat gct





  3834 ggc ggc ggc tct
!      start L2
  3846 ggt ggt ggt tct
  3858 ggt ggc ggc tct
  3870 gag ggt ggt ggc tct gag ggt ggc ggt tct
  3900 gag ggt ggc ggc tct gag gga ggc ggt tcc
  3930 ggt ggt ggc tct ggt ! end L2

! Domain 3
!       S   G   D   F   D   Y   E   K   M   A   N   A   N   K   G   A
  3945 tcc ggt gat ttt gat tat gaa aag atg gca aac gct aat aag ggg gct

        M   T   E   N   A   D   E   N   A   L   Q   S   D   A   K   G
  3993 atg acc gaa aat gcc gat gaa aac gcg cta cag tct gac gct aaa ggc

        K   L   D   S   V   A   T   D   Y   G   A   A   I   D   G   F
  4041 aaa ctt gat tct gtc gct act gat tac ggt gct gct atc gat ggt ttc

        I   G   D   V   S   G   L   A   N   G   N   G   A   T   G   D
  4089 att ggt gac gtt tcc ggc ctt gct aat ggt aat ggt gct act ggt gat

        F   A   G   S   N   S   Q   M   A   Q   V   G   D   G   D   N
  4137 ttt gct ggc tct aat tcc caa atg gct caa gtc ggt gac ggt gat aat

        S   P   L   M   N   N   F   R   Q   Y   L   P   S   L   P   Q
  4185 tca cct tta atg aat aat ttc cgt caa tat tta cct tcc ctc cct caa

        S   V   E   C   R   P   F   V   F   S   A   G   K   P   Y   E
  4233 tcg gtt gaa tgt cgc cct ttt gtc ttt agc gct ggt aaa cca tat gaa

        F   S   I   D   C   D   K   I   N   L   F   R
  4281 ttt tct att gat tgt gac aaa ata aac tta ttc cgt
       End Domain 3

        G   V   F   A   F   L   L   Y   V   A   T   F   M   Y   V   F140
  4317 ggt gtc ttt gcg ttt ctt tta tat gtt gcc acc ttt atg tat gta ttt
       start transmembrane segment

        S   T   F   A   N   I   L
  4365 tct acg ttt gct aac ata ctg

        R   N   K   E   S
  4386 cgt aat aag gag tct TAA ! stop of iii
       Intracellular anchor.

          M1  P2  V   L   L5  G   I   P   L   L10 L   R   F   L   G15
  4404 tc ATG cca gtt ctt ttg ggt att ccg tta tta ttg cgt ttc ctc ggt
          Start VI
                 M1  A2  V3      F5                  L10         G13
  4739 aaa TAA t ATG gct gtt tat ttt gta act ggc aaa tta ggc tct gga
           end VI Start gene I

   i    I   D   L   C   T   V   S   I   K   K   G   N   S   N   E
  iv                                                         M1  K
  5730 att gat tta tgt act gtt tcc att aaa aaa ggt aat tca aAT Gaa
                                                         Start IV

       344 345 346 347 348 349
   i    I   V   K   C   N   .   End of I
  iv     L3  L   N5  V   I7  N   F   V10
  5775 att gtt aaa tgt aat TAA T TTT GTT
       IV continued

  5800 ttc ttg atg ttt gtt tca tca tct tct ttt gct cag gta att gaa atg
  5848 aat aat tcg cct ctg cgc gat ttt gta act tgg tat tca aag caa tca
  5896 ggc gaa tcc gtt att gtt tct ccc gat gta aaa ggt act gtt act gta
  5944 tat tca tct gac gtt aaa cct gaa aat cta cgc aat ttc ttt att tct
  5992 gtt tta cgt gct aat aat ttt gat atg gtt ggt tca att cct tcc ata
  6040 att cag aag tat aat cca aac aat cag gat tat att gat gaa ttg cca
  6088 tca tct gat aat cag gaa tat gat gat aat tcc gct cct tct ggt ggt
  6136 ttc ttt gtt ccg caa aat gat aat gtt act caa act ttt aaa att aat
  6184 aac gtt cgg gca aag gat tta ata cga gtt gtc gaa ttg ttt gta aag
  6232 tct aat act tct aaa tcc tca aat gta tta tct att gac ggc tct aat
  6280 cta tta gtt gtt TCT gca cct aaa gat att tta gat aac ctt cct caa
       ApaLI removed
  6328 ttc ctt tct act gtt gat ttg cca act gac cag ata ttg att gag ggt
  6376 ttg ata ttt gag gtt cag caa ggt gat gct tta gat ttt tca ttt gct
  6424 gct ggc tct cag cgt ggc act gtt gca ggc ggt gtt aat act gac cgc
  6472 ctc acc tct gtt tta tct tct gct ggt ggt tcg ttc ggt att ttt aat
  6520 ggc gat gtt tta ggg cta tca gtt cgc gca tta aag act aat agc cat
  6568 tca aaa ata ttg tct gtg cca cgt att ctt acg ctt tca ggt cag aag
  6616 ggt tct atc tct gtT GGC CAg aat gtc cct ttt att act ggt cgt gtg
                        MscI
  6664 act ggt gaa tct gcc aat gta aat aat cca ttt cag acg att gag cgt
  6712 caa aat gta ggt att tcc atg agc gtt ttt cct gtt gca atg gct ggc
  6760 ggt aat att gtt ctg gat att acc agc aag gcc gat agt ttg agt tct
  6808 tct act cag gca agt gat gtt att act aat caa aga agt att gct aca
  6856 acg gtt aat ttg cgt gat gga cag act ctt tta ctc ggt ggc ctc act
  6904 gat tat aaa aac act tct caa gat tct ggc gta ccg ttc ctg tct aaa
  6952 atc cct tta atc ggc ctc ctg ttt agc tcc cgc tct gat tcc aac gag
  7000 gaa agc acg tta tac gtg ctc gtc aaa gca acc ata gta cgc gcc ctg
  7048 TAG cggcgcatt
       End IV

  7060 aagcgcggcg ggtgtggtgg ttacgcgcag cgtgaccgct acacttgcca gcgccctagc
  7120 gcccgctcct ttcgctttct tcccttcctt tctcgccacg ttcGCCGGCt ttccccgtca
                                                       NgoMI_
  7180 agctctaaat cgggggctcc ctttagggtt ccgatttagt gctttacggc acctcgaccc
  7240 caaaaaactt gatttgggtg atggttCACG TAGTGggcca tcgccctgat agacggtttt
                                   DraIII
  7300 tcgccctttG ACGTTGGAGT Ccacgttctt taatagtgga ctcttgttcc aaactggaac
                DrdI
  7360 aacactcaac cctatctcgg gctattcttt tgatttataa gggattttgc cgatttcgga
  7420 accaccatca aacaggattt tcgcctgctg gggcaaacca gcgtggaccg cttgctgcaa
  7480 ctctctcagg gccaggcggt gaagggcaat CAGCTGttgc cCGTCTCact ggtgaaaaga
                                        PvuII.      BsmBI.
  7540 aaaaccaccc tGGATCC AAGCTT
                   BamHI   HindIII (1/2)
       Insert carrying bla gene
  7563 gcaggtg gcacttttcg gggaaatgtg cgcggaaccc
  7600 ctatttgttt atttttctaa atacattcaa atatGTATCC gctcatgaga caataaccct
                                            BciVI
  7660 gataaatgct tcaataatat tgaaaaAGGA AGAgt
                                   RBS.?...

RBS . ? . . . 
       BssSI...
       ApaLI removed

  7848 ggt aag atc ctt gag agt ttt cgc ccc gaa gaa cgt ttt cca atg atg agc
  7899 act ttt aaa gtt ctg cta tgt cat aca cta tta tcc cgt att gac gcc ggg
  7950 caa gaG CAA CTC GGT CGc cgg gcg cgg tat tct cag aat gac ttg gtt gAG
               BcgI                                                      ScaI...
  8001 TAC Tca cca gtc aca gaa aag cat ctt acg gat ggc atg aca gta aga gaa
       ...ScaI
  8052 tta tgc agt gct gcc ata acc atg agt gat aac act gcg gcc aac tta ctt
  8103 ctg aca aCG ATC Gga gga ccg aag gag cta acc gct ttt ttg cac aac atg
                PvuI
  8154 ggg gat cat gta act cgc ctt gat cgt tgg gaa ccg gag ctg aat gaa gcc
  8205 ata cca aac gac gag cgt gac acc acg atg cct gta gca atg cca aca acg
  8256 tTG CGC Aaa cta tta act ggc gaa cta ctt act cta gct tcc cgg caa caa
        FspI...
  8307 tta ata gac tgg atg gag gcg gat aaa gtt gca gga cca ctt ctg cgc tcg
  8358 GCC ctt ccG GCt ggc tgg ttt att gct gat aaa tct gga gcc ggt gag cgt
       BglI
  8409 gGG TCT Cgc ggt atc att gca gca ctg ggg cca gat ggt aag ccc tcc cgt
        BsaI
  8460 atc gta gtt atc tac acG ACg ggg aGT Cag gca act atg gat gaa cga aat
                             AhdI



  8511 aga cag atc gct gag ata ggt gcc tca ctg att aag cat tgg TAA ctgt
                                                               stop
  8560 cagaccaagt ttactcatat atactttaga ttgatttaaa acttcatttt taatttaaaa
  8620 ggatctaggt gaagatcctt tttgataatc tcatgaccaa aatcccttaa cgtgagtttt
  8680 cgttccactg tacgtaagac cccc
  8704 AAGCTT GTC GAC tgaa tggcgaatgg cgctttgcct
       HindIII SalI..
       (2/2)   HincII
  8740 ggtttccggc accagaagcg gtgccggaaa gctggctgga gtgcgatctt
  8790 CCTGAGG
       Bsu36I_
  8797 ccgat actgtcgtcg tcccctcaaa ctggcagatg
  8832 cacggttacg atgcgcccat ctacaccaac gtaacctatc ccattacggt caatccgccg
  8892 tttgttccca cggagaatcc gacgggttgt tactcgctca catttaatgt tgatgaaagc
  8952 tggctacagg aaggccagac gcgaattatt tttgatggcg ttcctattgg ttaaaaaatg
  9012 agctgattta acaaaaattt aacgcgaatt ttaacaaaat attaacgttt acaATTTAAA
                                                                 SwaI...
  9072 Tatttgctta tacaatcttc ctgtttttgg ggcttttctg attatcaacc GGGGTAcat
                                                              RBS?
  9131 ATG att gac atg cta gtt tta cga tta ccg ttc atc gat tct ctt gtt tgc
       Start gene II
  9182 tcc aga ctc tca ggc aat gac ctg ata gcc ttt gtA GAT CTc tca aaa ata
                                                     BglII...
  9233 gct acc ctc tcc ggc atg aat tta tca gct aga acg gtt gaa tat cat att
  9284 gat ggt gat ttg act gtc tcc ggc ctt tct cac cct ttt gaa tct tta cct
  9335 aca cat tac tca ggc att gca ttt aaa ata tat gag ggt tct aaa aat ttt
  9386 tat cct tgc gtt gaa ata aag gct tct ccc gca aaa gta tta cag ggt cat
  9437 aat gtt ttt ggt aca acc gat tta gct tta tgc tct gag gct tta ttg ctt
  9488 aat ttt gct aat tct ttg cct tgc ctg tat gat tta ttg gat gtt ! 9532
       gene II continues



Table 120B: Sequence of MALIA3, condensed 

LOCUS MALIA3 9532 CIRCULAR

ORIGIN 



    1 AATGCTACTA CTATTAGTAG AATTGATGCC ACCTTTTCAG CTCGCGCCCC AAATGAAAAT
   61 ATAGCTAAAC AGGTTATTGA CCATTTGCGA AATGTATCTA ATGGTCAAAC TAAATCTACT
  121 CGTTCGCAGA ATTGGGAATC AACTGTTACA TGGAATGAAA CTTCCAGACA CCGTACTTTA
  181 GTTGCATATT TAAAACATGT TGAGCTACAG CACCAGATTC AGCAATTAAG CTCTAAGCCA
  241 TCCGCAAAAA TGACCTCTTA TCAAAAGGAG CAATTAAAGG TACTCTCTAA TCCTGACCTG
  301 TTGGAGTTTG CTTCCGGTCT GGTTCGCTTT GAAGCTCGAA TTAAAACGCG ATATTTGAAG
  361 TCTTTCGGGC TTCCTCTTAA TCTTTTTGAT GCAATCCGCT TTGCTTCTGA CTATAATAGT
  421 CAGGGTAAAG ACCTGATTTT TGATTTATGG TCATTCTCGT TTTCTGAACT GTTTAAAGCA
  481 TTTGAGGGGG ATTCAATGAA TATTTATGAC GATTCCGCAG TATTGGACGC TATCCAGTCT
  541 AAACATTTTA CTATTACCCC CTCTGGCAAA ACTTCTTTTG CAAAAGCCTC TCGCTATTTT
  601 GGTTTTTATC GTCGTCTGGT AAACGAGGGT TATGATAGTG TTGCTCTTAC TATGCCTCGT
  661 AATTCCTTTT GGCGTTATGT ATCTGCATTA GTTGAATGTG GTATTCCTAA ATCTCAACTG
  721 ATGAATCTTT CTACCTGTAA TAATGTTGTT CCGTTAGTTC GTTTTATTAA CGTAGATTTT
  781 TCTTCCCAAC GTCCTGACTG GTATAATGAG CCAGTTCTTA AAATCGCATA AGGTAATTCA
  841 CAATGATTAA AGTTGAAATT AAACCATCTC AAGCCCAATT TACTACTCGT TCTGGTGTTT
  901 CTCGTCAGGG CAAGCCTTAT TCACTGAATG AGCAGCTTTG TTACGTTGAT TTGGGTAATG
  961 AATATCCGGT TCTTGTCAAG ATTACTCTTG ATGAAGGTCA GCCAGCCTAT GCGCCTGGTC
 1021 TGTACACCGT TCATCTGTCC TCTTTCAAAG TTGGTCAGTT CGGTTCCCTT ATGATTGACC
 1081 GTCTGCGCCT CGTTCCGGCT AAGTAACATG GAGCAGGTCG CGGATTTCGA CACAATTTAT
 1141 CAGGCGATGA TACAAATCTC CGTTGTACTT TGTTTCGCGC TTGGTATAAT CGCTGGGGGT
 1201 CAAAGATGAG TGTTTTAGTG TATTCTTTCG CCTCTTTCGT TTTAGGTTGG TGCCTTCGTA
 1261 GTGGCATTAC GTATTTTACC CGTTTAATGG AAACTTCCTC ATGAAAAAGT CTTTAGTCCT
 1321 CAAAGCCTCT GTAGCCGTTG CTACCCTCGT TCCGATGCTG TCTTTCGCTG CTGAGGGTGA
 1381 CGATCCCGCA AAAGCGGCCT TTAACTCCCT GCAAGCCTCA GCGACCGAAT ATATCGGTTA
 1441 TGCGTGGGCG ATGGTTGTTG TCATTGTCGG CGCAACTATC GGTATCAAGC TGTTTAAGAA
 1501 ATTCACCTCG AAAGCAAGCT GATAAACCGA TACAATTAAA GGCTCCTTTT GGAGCCTTTT
 1561 TTTTTGGAGA TTTTCAACGT GAAAAAATTA TTATTCGCAA TTCCTTTAGT TGTTCCTTTC
 1621 TATTCTCACA GTGCACAGTC TGTCGTGACG CAGCCGCCCT CAGTGTCTGG GGCCCCAGGG
 1681 CAGAGGGTCA CCATCTCCTG CACTGGGAGC AGCTCCAACA TCGGGGCAGG TTATGATGTA
 1741 CACTGGTACC AGCAGCTTCC AGGAACAGCC CCCAAACTCC TCATCTATGG TAACAGCAAT
 1801 CGGCCCTCAG GGGTCCCTGA CCGATTCTCT GGCTCCAAGT CTGGCACCTC AGCCTCCCTG
 1861 GCCATCACTG GGCTCCAGGC TGAGGATGAG GCTGATTATT ACTGCCAGTC CTATGACAGC
 1921 AGCCTGAGTG GCCTTTATGT CTTCGGAACT GGGACCAAGG TCACCGTCCT AGGTCAGCCC
 1981 AAGGCCAACC CCACTGTCAC TCTGTTCCCG CCCTCCTCTG AGGAGCTCCA AGCCAACAAG
 2041 GCCACACTAG TGTGTCTGAT CAGTGACTTC TACCCGGGAG CTGTGACAGT GGCCTGGAAG
 2101 GCAGATAGCA GCCCCGTCAA GGCGGGAGTG GAGACCACCA CACCCTCCAA ACAAAGCAAC
 2161 AACAAGTACG CGGCCAGCAG CTATCTGAGC CTGACGCCTG AGCAGTGGAA GTCCCACAGA
 2221 AGCTACAGCT GCCAGGTCAC GCATGAAGGG AGCACCGTGG AGAAGACAGT GGCCCCTACA
 2281 GAATGTTCAT AATAAACCGC CTCCACCGGG CGCGCCAATT CTATTTCAAG GAGACAGTCA
 2341 TAATGAAATA CCTATTGCCT ACGGCAGCCG CTGGATTGTT ATTACTCGCG GCCCAGCCGG
 2401 CCATGGCCGA AGTTCAATTG TTAGAGTCTG GTGGCGGTCT TGTTCAGCCT GGTGGTTCTT
 2461 TACGTCTTTC TTGCGCTGCT TCCGGATTCA CTTTCTCTTC GTACGCTATG TCTTGGGTTC
 2521 GCCAAGCTCC TGGTAAAGGT TTGGAGTGGG TTTCTGCTAT CTCTGGTTCT GGTGGCAGTA
 2581 CTTACTATGC TGACTCCGTT AAAGGTCGCT TCACTATCTC TAGAGACAAC TCTAAGAATA
 2641 CTCTCTACTT GCAGATGAAC AGCTTAAGGG CTGAGGACAC TGCAGTCTAC TATTGCGCTA
 2701 AAGACTATGA AGGTACTGGT TATGCTTTCG ACATATGGGG TCAAGGTACT ATGGTCACCG
 2761 TCTCTAGTGC CTCCACCAAG GGCCCATCGG TCTTCCCCCT GGCACCCTCC TCCAAGAGCA
 2821 CCTCTGGGGG CACAGCGGCC CTGGGCTGCC TGGTCAAGGA CTACTTCCCC GAACCGGTGA
 2881 CGGTGTCGTG GAACTCAGGC GCCCTGACCA GCGGCGTCCA CACCTTCCCG GCTGTCCTAC
 2941 AGTCTAGCGG ACTCTACTCC CTCAGCAGCG TAGTGACCGT GCCCTCTTCT AGCTTGGGCA
 3001 CCCAGACCTA CATCTGCAAC GTGAATCACA AGCCCAGCAA CACCAAGGTG GACAAGAAAG
 3061 TTGAGCCCAA ATCTTGTGCG GCCGCTCATC ACCACCATCA TCACTCTGCT GAACAAAAAC
 3121 TCATCTCAGA AGAGGATCTG AATGGTGCCG CAGATATCAA CGATGATCGT ATGGCTGGCG
 3181 CCGCTGAAAC TGTTGAAAGT TGTTTAGCAA AACCCCATAC AGAAAATTCA TTTACTAACG
 3241 TCTGGAAAGA CGACAAAACT TTAGATCGTT ACGCTAACTA TGAGGGTTGT CTGTGGAATG
 3301 CTACAGGCGT TGTAGTTTGT ACTGGTGACG AAACTCAGTG TTACGGTACA TGGGTTCCTA
 3361 TTGGGCTTGC TATCCCTGAA AATGAGGGTG GTGGCTCTGA GGGTGGCGGT TCTGAGGGTG
 3421 GCGGTTCTGA GGGTGGCGGT ACTAAACCTC CTGAGTACGG TGATACACCT ATTCCGGGCT
 3481 ATACTTATAT CAACCCTCTC GACGGCACTT ATCCGCCTGG TACTGAGCAA AACCCCGCTA
 3541 ATCCTAATCC TTCTCTTGAG GAGTCTCAGC CTCTTAATAC TTTCATGTTT CAGAATAATA
 3601 GGTTCCGAAA TAGGCAGGGG GCATTAACTG TTTATACGGG CACTGTTACT CAAGGCACTG
 3661 ACCCCGTTAA AACTTATTAC CAGTACACTC CTGTATCATC AAAAGCCATG TATGACGCTT
 3721 ACTGGAACGG TAAATTCAGA GACTGCGCTT TCCATTCTGG CTTTAATGAA GATCCATTCG
 3781 TTTGTGAATA TCAAGGCCAA TCGTCTGACC TGCCTCAACC TCCTGTCAAT GCTGGCGGCG
 3841 GCTCTGGTGG TGGTTCTGGT GGCGGCTCTG AGGGTGGTGG CTCTGAGGGT GGCGGTTCTG
 3901 AGGGTGGCGG CTCTGAGGGA GGCGGTTCCG GTGGTGGCTC TGGTTCCGGT GATTTTGATT
 3961 ATGAAAAGAT GGCAAACGCT AATAAGGGGG CTATGACCGA AAATGCCGAT GAAAACGCGC
 4021 TACAGTCTGA CGCTAAAGGC AAACTTGATT CTGTCGCTAC TGATTACGGT GCTGCTATCG
 4081 ATGGTTTCAT TGGTGACGTT TCCGGCCTTG CTAATGGTAA TGGTGCTACT GGTGATTTTG
 4141 CTGGCTCTAA TTCCCAAATG GCTCAAGTCG GTGACGGTGA TAATTCACCT TTAATGAATA
 4201 ATTTCCGTCA ATATTTACCT TCCCTCCCTC AATCGGTTGA ATGTCGCCCT TTTGTCTTTA
 4261 GCGCTGGTAA ACCATATGAA TTTTCTATTG ATTGTGACAA AATAAACTTA TTCCGTGGTG
 4321 TCTTTGCGTT TCTTTTATAT GTTGCCACCT TTATGTATGT ATTTTCTACG TTTGCTAACA
 4381 TACTGCGTAA TAAGGAGTCT TAATCATGCC AGTTCTTTTG GGTATTCCGT TATTATTGCG
 4441 TTTCCTCGGT TTCCTTCTGG TAACTTTGTT CGGCTATCTG CTTACTTTTC TTAAAAAGGG







4501 CTTCGGTAAG ATAGCTATTG CTATTTCATT GTTTCTTGCT CTTATTATTG GGCTTAACTC
4561 AATTCTTGTG GGTTATCTCT CTGATATTAG CGCTCAATTA CCCTCTGACT TTGTTCAGGG
4621 TGTTCAGTTA ATTCTCCCGT CTAATGCGCT TCCCTGTTTT TATGTTATTC TCTCTGTAAA
4681 GGCTGCTATT TTCATTTTTG ACGTTAAACA AAAAATCGTT TCTTATTTGG ATTGGGATAA
4741 ATAATATGGC TGTTTATTTT GTAACTGGCA AATTAGGCTC TGGAAAGACG CTCGTTAGCG
4801 TTGGTAAGAT TCAGGATAAA ATTGTAGCTG GGTGCAAAAT AGCAACTAAT CTTGATTTAA
4861 GGCTTCAAAA CCTCCCGCAA GTCGGGAGGT TCGCTAAAAC GCCTCGCGTT CTTAGAATAC
4921 CGGATAAGCC TTCTATATCT GATTTGCTTG CTATTGGGCG CGGTAATGAT TCCTACGATG
4981 AAAATAAAAA CGGCTTGCTT GTTCTCGATG AGTGCGGTAC TTGGTTTAAT ACCCGTTCTT
5041 GGAATGATAA GGAAAGACAG CCGATTATTG ATTGGTTTCT ACATGCTCGT AAATTAGGAT
5101 GGGATATTAT TTTTCTTGTT CAGGACTTAT CTATTGTTGA TAAACAGGCG CGTTCTGCAT
5161 TAGCTGAACA TGTTGTTTAT TGTCGTCGTC TGGACAGAAT TACTTTACCT TTTGTCGGTA
5221 CTTTATATTC TCTTATTACT GGCTCGAAAA TGCCTCTGCC TAAATTACAT GTTGGCGTTG
5281 TTAAATATGG CGATTCTCAA TTAAGCCCTA CTGTTGAGCG TTGGCTTTAT ACTGGTAAGA
5341 ATTTGTATAA CGCATATGAT ACTAAACAGG CTTTTTCTAG TAATTATGAT TCCGGTGTTT
5401 ATTCTTATTT AACGCCTTAT TTATGACACG GTCGGTATTT CAAACCATTA AATTTAGGTC
5461 AGAAGATGAA ATTAACTAAA ATATATTTGA AAAAGTTTTC TCGCGTTCTT TGTCTTGCGA
5521 TTGGATTTGC ATCAGCATTT ACATATAGTT ATATAACCCA ACCTAAGCCG GAGGTTAAAA
5581 AGGTAGTCTC TCAGACCTAT GATTTTGATA AATTCACTAT TGACTCTTCT CAGCGTCTTA
5641 ATCTAAGCTA TCGCTATGTT TTCAAGGATT CTAAGGGAAA ATTAATTAAT AGCGACGATT
5701 TACAGAAGCA AGGTTATTCA CTCACATATA TTGATTTATG TACTGTTTCC ATTAAAAAAG
5761 GTAATTCAAA TGAAATTGTT AAATGTAATT AATTTTGTTT TCTTGATGTT TGTTTCATCA
5821 TCTTCTTTTG CTCAGGTAAT TGAAATGAAT AATTCGCCTC TGCGCGATTT TGTAACTTGG
5881 TATTCAAAGC AATCAGGCGA ATCCGTTATT GTTTCTCCCG ATGTAAAAGG TACTGTTACT
5941 GTATATTCAT CTGAGGTTAA ACCTGAAAAT CTACGCAATT TCTTTATTTC TGTTTTACGT
6001 GCTAATAATT TTGATATGGT TGGTTCAATT CCTTCCATAA TTCAGAAGTA TAATCCAAAC
6061 AATCAGGATT ATATTGATGA ATTGCCATCA TCTGATAATC AGGAATATGA TGATAATTCC
6121 GCTCCTTCTG GTGGTTTCTT TGTTCCGCAA AATGATAATG TTACTCAAAC TTTTAAAATT
6181 AATAACGTTC GGGCAAAGGA TTTAATACGA GTTGTCGAAT TGTTTGTAAA GTCTAATACT
6241 TCTAAATCCT CAAATGTATT ATCTATTGAC GGCTCTAATC TATTAGTTGT TTCTGCACCT
6301 AAAGATATTT TAGATAACCT TCCTCAATTC CTTTCTACTG TTGATTTGCC AACTGACCAG
6361 ATATTGATTG AGGGTTTGAT ATTTGAGGTT CAGCAAGGTG ATGCTTTAGA TTTTTCATTT
6421 GCTGCTGGCT CTCAGCGTGG CACTGTTGCA GGCGGTGTTA ATACTGACCG CCTCACCTCT
6481 GTTTTATCTT CTGCTGGTGG TTCGTTCGGT ATTTTTAATG GCGATGTTTT AGGGCTATCA
6541 GTTCGCGCAT TAAAGACTAA TAGCCATTCA AAAATATTGT CTGTGCCACG TATTCTTACG
6601 CTTTCAGGTC AGAAGGGTTC TATCTCTGTT GGCCAGAATG TCCCTTTTAT TACTGGTCGT
6661 GTGACTGGTG AATCTGCCAA TGTAAATAAT CCATTTCAGA CGATTGAGCG TCAAAATGTA
6721 GGTATTTCCA TGAGCGTTTT TCCTGTTGCA ATGGCTGGCG GTAATATTGT TCTGGATATT
6781 ACCAGCAAGG CCGATAGTTT GAGTTCTTCT ACTCAGGCAA GTGATGTTAT TACTAATCAA



6841 AGAAGTATTG CTACAACGGT TAATTTGCGT GATGGACAGA CTCTTTTACT CGGTGGCCTC
6901 ACTGATTATA AAAACACTTC TCAAGATTCT GGCGTACCGT TCCTGTCTAA AATCCCTTTA 
6961 ATCGGCCTCC TGTTTAGCTC CCGCTCTGAT TCCAACGAGG AAAGCACGTT ATACGTGCTC 
7021 GTCAAAGCAA CCATAGTACG CGCCCTGTAG CGGCGCATTA AGCGCGGCGG GTGTGGTGGT
7081 TACGCGCAGC GTGACCGCTA CACTTGCCAG CGCCCTAGCG CCCGCTCCTT TCGCTTTCTT 
7141 CCCTTCCTTT CTCGCCACGT TCGCCGGCTT TCCCCGTCAA GCTCTAAATC GGGGGCTCCC 
7201 TTTAGGGTTC CGATTTAGTG CTTTACGGCA CCTCGACCCC AAAAAACTTG ATTTGGGTGA 
7261 TGGTTCACGT AGTGGGCCAT CGCCCTGATA GACGGTTTTT CGCCCTTTGA CGTTGGAGTC 
7321 CACGTTCTTT AATAGTGGAC TCTTGTTCCA AACTGGAACA ACACTCAACC CTATCTCGGG 
7381 CTATTCTTTT GATTTATAAG GGATTTTGCC GATTTCGGAA CCACCATCAA ACAGGATTTT
7441 CGCCTGCTGG GGCAAACCAG CGTGGACCGC TTGCTGCAAC TCTCTCAGGG CCAGGCGGTG 
7501 AAGGGCAATC AGCTGTTGCC CGTCTCACTG GTGAAAAGAA AAACCACCCT GGATCCAAGC 
7561 TTGCAGGTGG CACTTTTCGG GGAAATGTGC GCGGAACCCC TATTTGTTTA TTTTTCTAAA 
7621 TACATTCAAA TATGTATCCG CTCATGAGAC AATAACCCTG ATAAATGCTT CAATAATATT
7681 GAAAAAGGAA GAGTATGAGT ATTCAACATT TCCGTGTCGC CCTTATTCCC TTTTTTGCGG
7741 CATTTTGCCT TCCTGTTTTT GCTCACCCAG AAACGCTGGT GAAAGTAAAA GATGCTGAAG 
7801 ATCAGTTGGG CGCACGAGTG GGTTACATCG AACTGGATCT CAACAGCGGT AAGATCCTTG
7861 AGAGTTTTCG CCCCGAAGAA CGTTTTCCAA TGATGAGCAC TTTTAAAGTT CTGCTATGTC 
7921 ATACACTATT ATCCCGTATT GACGCCGGGC AAGAGCAACT CGGTCGCCGG GCGCGGTATT
7981 CTCAGAATGA CTTGGTTGAG TACTCACCAG TCACAGAAAA GCATCTTACG GATGGCATGA
8041 CAGTAAGAGA ATTATGCAGT GCTGCCATAA CCATGAGTGA TAACACTGCG GCCAACTTAC
8101 TTCTGACAAC GATCGGAGGA CCGAAGGAGC TAACCGCTTT TTTGCACAAC ATGGGGGATC
8161 ATGTAACTCG CCTTGATCGT TGGGAACCGG AGCTGAATGA AGCCATACCA AACGACGAGC
8221 GTGACACCAC GATGCCTGTA GCAATGCCAA CAACGTTGCG CAAACTATTA ACTGGCGAAC 
8281 TACTTACTCT AGCTTCCCGG CAACAATTAA TAGACTGGAT GGAGGCGGAT AAAGTTGCAG
8341 GACCACTTCT GCGCTCGGCC CTTCCGGCTG GCTGGTTTAT TGCTGATAAA TCTGGAGCCG 
8401 GTGAGCGTGG GTCTCGCGGT ATCATTGCAG CACTGGGGCC AGATGGTAAG CCCTCCCGTA
8461 TCGTAGTTAT CTACACGACG GGGAGTCAGG CAACTATGGA TGAACGAAAT AGACAGATCG
8521 CTGAGATAGG TGCCTCACTG ATTAAGCATT GGTAACTGTC AGACCAAGTT TACTCATATA
8581 TACTTTAGAT TGATTTAAAA CTTCATTTTT AATTTAAAAG GATCTAGGTG AAGATCCTTT 
8641 TTGATAATCT CATGACCAAA ATCCCTTAAC GTGAGTTTTC GTTCCACTGT ACGTAAGACC
8701 CCCAAGCTTG TCGACTGAAT GGCGAATGGC GCTTTGCCTG GTTTCCGGCA CCAGAAGCGG
8761 TGCCGGAAAG CTGGCTGGAG TGCGATCTTC CTGAGGCCGA TACTGTCGTC GTCCCCTCAA
8821 ACTGGCAGAT GCACGGTTAC GATGCGCCCA TCTACACCAA CGTAACCTAT CCCATTACGG
8881 TCAATCCGCC GTTTGTTCCC ACGGAGAATC CGACGGGTTG TTACTCGCTC ACATTTAATG 
8941 TTGATGAAAG CTGGCTACAG GAAGGCCAGA CGCGAATTAT TTTTGATGGC GTTCCTATTG
9001 GTTAAAAAAT GAGCTGATTT AACAAAAATT TAACGCGAAT TTTAACAAAA TATTAACGTT
9061 TACAATTTAA ATATTTGCTT ATACAATCTT CCTGTTTTTG GGGCTTTTCT GATTATCAAC
9121 CGGGGTACAT ATGATTGACA TGCTAGTTTT ACGATTACCG TTCATCGATT CTCTTGTTTG



9181 CTCCAGACTC TCAGGCAATG ACCTGATAGC CTTTGTAGAT CTCTCAAAAA TAGCTACCCT
9241 CTCCGGCATG AATTTATCAG CTAGAACGGT TGAATATCAT ATTGATGGTG ATTTGACTGT
9301 CTCCGGCCTT TCTCACCCTT TTGAATCTTT ACCTACACAT TACTCAGGCA TTGCATTTAA
9361 AATATATGAG GGTTCTAAAA ATTTTTATCC TTGCGTTGAA ATAAAGGCTT CTCCCGCAAA
9421 AGTATTACAG GGTCATAATG TTTTTGGTAC AACCGATTTA GCTTTATGCT CTGAGGCTTT
9481 ATTGCTTAAT TTTGCTAATT CTTTGCCTTG CCTGTATGAT TTATTGGATG TT
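
The listing above pairs a start coordinate with blocks of ten bases, so each full row advances the coordinate by 60. A minimal sketch of how such rows could be joined back into one contiguous sequence, while checking that every coordinate follows from the previous row, is given below. The function name `parse_numbered_sequence` is hypothetical and not part of the original document; the two sample rows are copied from the listing.

```python
def parse_numbered_sequence(text):
    """Join rows of the form '<start> <block> <block> ...' into one sequence.

    Returns (start_coordinate, sequence). Spaces inside a row are ignored,
    so blocks that were split by extraction are rejoined automatically.
    """
    start = None
    chunks = []
    length = 0
    for line in text.splitlines():
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # skip blank lines and stray non-sequence text
        coord = int(fields[0])
        bases = "".join(fields[1:]).upper()
        if set(bases) - set("ACGTN"):
            raise ValueError(f"unexpected characters in row {coord}")
        if start is None:
            start = coord
        elif coord != start + length:
            # A gap or overlap means a row was lost or mis-numbered.
            raise ValueError(f"row {coord} does not continue from {start + length - 1}")
        chunks.append(bases)
        length += len(bases)
    return start, "".join(chunks)


rows = """
6901 ACTGATTATA AAAACACTTC TCAAGATTCT GGCGTACCGT TCCTGTCTAA AATCCCTTTA
6961 ATCGGCCTCC TGTTTAGCTC CCGCTCTGAT TCCAACGAGG AAAGCACGTT ATACGTGCTC
"""
start, seq = parse_numbered_sequence(rows)
print(start, len(seq))
```

The coordinate check is the useful part: a dropped or duplicated block shows up as a row whose start does not match the running length.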



Table 200: Enzymes that either cut 15 or more human GLGs or have 5+-base recognition in FR3
Typical entry: 

REname Recognition #sites 

GLGid# : base# GLGid# : base# GLGid# : base# 


BstEII Ggtnacc 2 
1: 3 48: 3 
There are 2 hits at base# 3 

MaeIII gtnac 36
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There are 21 hits at base# 4 



HphI tcacc 4 5 
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30 
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8: 


5 
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11 
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There are 44 hits at base# 5 



NlaIII CATG 26
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42 
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78 
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42 
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21 
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42 
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42 
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42 
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42 
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57 
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48 
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57 
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57 
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72 
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78 
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78 



















There are 11 hits at base# 42 



There are 1 hits at base# 48 Could cause raggedness. 



BsaJI Ccnngg 37 
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51: 14 

There are 23 hits at base# 65 

There are 14 hits at base# 14 
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52 


40: 


47 
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52 


43: 


47 


43: 


52 



42 
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9: 


47 


10: 


47 


25: 


63 


31: 


63 


38: 


47 


38: 


52 


41: 


47 


41: 


52 


44 : 


47 


44: 


52 



5: 
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6: 


47 


11: 


47 


16: 


63 


32: 


63 


36: 


63 
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47 


39: 


52 


42: 


47 


42: 


52 


45: 


47 


45: 


52 



46: 47 46: 52 47: 47 47: 52 49: 15 50: 47
There are 23 hits at base# 47

There are 11 hits at base# 52 Only 5 bases from 47



BlpI GCtnagc 21 

1: 48 2: 48 3: 48 5: 48 6: 48 7: 48 
8: 48 9: 48 10: 48 11: 48 37: 48 38: 48
39: 48 40: 48 41: 48 42: 48 43: 48 44: 48 
45: 48 46: 48 47: 48 
There are 21 hits at base# 48 



MwoI GCNNNNNnngc 19
1: 48 2: 28 19: 36 22: 36 23: 36 24: 36
25: 36 26: 36 35: 36 37: 67 39: 67 40: 67
41: 67 42: 67 43: 67 44: 67 45: 67 46: 67
47: 67

There are 10 hits at base# 67
There are 7 hits at base# 36
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35 



DdeI Ctnag
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58 
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58 
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65 
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65 
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58 


24: 


65 
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58 



25: 65 
31: 58 



26: 
31: 



58 
65 



27: 58 



27: 65 



32: 58 



32: 65 



36: 65 



37: 49 
42: 26 
47: 49 



38: 49 
42: 49 
48: 12 



39: 26 
43: 49 
49: 12 



28: 
35: 
39: 
44: 
51: 



58 
58 
49 
49 
65 



30: 58 

36: 58 

40: 49 

45: 49 



41: 49 
46: 49 

There are 29 hits at base# 58

There are 22 hits at base# 49 Only nine bases from 58
There are 16 hits at base# 65 Only seven bases from 58



BglII Agatct 11

1: 61 2: 61 3: 61 4: 61 5: 61 6: 61
7: 61 9: 61 10: 61 11: 61 51: 47
There are 10 hits at base# 61



BstYI Rgatcy 12

1: 61 2: 61 3: 61 4: 61 5: 61 6: 61
7: 61 8: 61 9: 61 10: 61 11: 61 51: 47

There are 11 hits at base# 61



Hpy188I TCNga 17
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57 


35: 57 


48: 


67 
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67 







There are 11 hits at base# 64
There are 4 hits at base# 57

There are 2 hits at base# 67 Could be ragged. 



MslI CAYNNnnRTG 44 
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There are 44 hits at base# 72 



BsiEI CGRYcg 23

1: 74 3: 74 4: 74 5: 74 7: 74 8: 74
9: 74 10: 74 11: 74 17: 74 22: 74 30: 74
33: 74 34: 74 37: 74 38: 74 39: 74 40: 74
41: 74 42: 74 45: 74 46: 74 47: 74

There are 23 hits at base# 74

EaeI Yggccr 23

1: 74 3: 74 4: 74 5: 74 7: 74 8: 74
9: 74 10: 74 11: 74 17: 74 22: 74 30: 74
33: 74 34: 74 37: 74 38: 74 39: 74 40: 74
41: 74 42: 74 45: 74 46: 74 47: 74

There are 23 hits at base# 74

EagI Cggccg 23

1: 74 3: 74 4: 74 5: 74 7: 74 8: 74
9: 74 10: 74 11: 74 17: 74 22: 74 30: 74
33: 74 34: 74 37: 74 38: 74 39: 74 40: 74
41: 74 42: 74 45: 74 46: 74 47: 74

There are 23 hits at base# 74
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There are 25 hits at basef 75 
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There are 51 hits at base# 86 All the other sites are well away 
HpyCH4III ACNgt 63 
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44: 86 45: 86 46: 86 47: 86 
50: 86 51: 0 51: 86 
There are 51 Hits at base# 86 
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49: 86 
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There are 18 hits at base# 2 
PleI gagtc 18



There are 

AciI Ccgc

2: 26 9: 14 
37: 65 38: 62 
43: 62 



18 hits at base# 2

24 
11: 14 



42: 65 

46: 62 47: 62 
There are 
There are 

There are 
There are 
There are 
There are 



10: 14 
39: 65 
43: 65 



40: 62 



27: 
40: 



74 

65 



44: 62 



44 : 65 



47: 65 



48: 35 



48: 74 



8 hits at base# 62
8 hits at base# 65 

3 hits at base# 14 
3 hits at base# 74 
1 hits at base# 26 
1 hits at base# 35 
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37: 62 
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45: 
49: 



65 
62 
74 



-"- Gcgg 11 

8: 91 9: 16 10: 16 11: 16 37: 67 39: 67 

40: 67 42: 67 43: 67 45: 67 46: 67 

There are 7 hits at base# 67

There are 3 hits at base# 16

There are 1 hits at base# 91 

BsiHKAI GWGCWc 20

2: 30 4: 30 6: 30 7: 30 9: 30 10: 30

12: 89 13: 89 14: 89 37: 51 38: 51 39: 51

40: 51 41: 51 42: 51 43: 51 44: 51 45: 51 
46: 51 47: 51 

There are 11 hits at base# 51 

Bsp1286I GDGCHc 20

2: 30 4: 30 6: 30 7: 30 9: 30 10: 30 

12: 89 13: 89 14: 89 37: 51 38: 51 39: 51 

40: 51 41: 51 42: 51 43: 51 44: 51 45: 51 
46: 51 47: 51 

There are 11 hits at base# 51



HgiAI GWGCWc 20

2: 30 4: 30 6: 30 7: 30 9: 30 10: 30 

12: 89 13: 89 14: 89 37: 51 38: 51 39: 51 

40: 51 41: 51 42: 51 43: 51 44: 51 45: 51
46: 51 47: 51 

There are 11 hits at base# 51 

BsoFI GCngc 26

2: 53 3: 53 5: 53 6: 53 7: 53 8: 53

8: 91 9: 53 10: 53 11: 53 31: 53 36: 36 

37: 64 39: 64 40: 64 41: 64 42: 64 43: 64 

44: 64 45: 64 46: 64 47: 64 48: 53 49: 53 
50: 45 51: 53 

There are 13 hits at base# 53

There are 10 hits at base# 64 



TseI Gcwgc 17
2: 53 3: 53 5: 53 6: 53 7: 53 8: 53
9: 53 10: 53 11: 53 31: 53 36: 36 45: 64
46: 64 48: 53 49: 53 50: 45 51: 53
There are 13 hits at base# 53
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There are 31 hits at base# 67 



HpyCH4V TGca 34 
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90 
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44 


51: 


52 











There are 21 hits at base# 44 

There are 1 hits at base# 52 



AccI GTmkac 13 5-base recognition 

7: 37 11: 24 37: 16 38: 16 39: 16 40: 16 
41: 16 42: 16 43: 16 44: 16 45: 16 46: 16 
47: 16 

There are 11 hits at base# 16 



SacII CCGCgg 8 6-base recognition

9: 14 10: 14 11: 14 37: 65 39: 65 40: 65 
42: 65 43: 65 

There are 5 hits at base# 65 
There are 3 hits at base# 14 



TfiI Gawtc 24

9: 22 15: 2 16: 2 17: 2 18: 2 19: 2 



26: 2 27: 2 28: 2 29: 2 
32: 2 33: 2 33: 22 34: 22 
There are 20 hits at base# 2 



30: 2 
35: 2 



31: 2 
36: 2 



BsmAI Nnnnnngagac 19
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36: 


11 
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87 



48: 87 

There are 16 hits at base# 11






BpmI ctccag 19
15: 12 16: 12 17: 12 18: 12 20: 12 21: 12
22: 12 23: 12 24: 12 25: 12 26: 12 27: 12
28: 12 30: 12 31: 12 32: 12 34: 12 35: 12
36: 12

There are 19 hits at base# 12



XmnI GAANNnnttc 12 
37: 30 38: 30 39: 30 40: 30 41: 30 42: 30
43: 30 44: 30 45: 30 46: 30 47: 30 50: 30 
There are 12 hits at base# 30 

BsrI NCcagt 12 
37: 32 38: 32 39: 32 40: 32 41: 32 42: 32
43: 32 44: 32 45: 32 46: 32 47: 32 50: 32 
There are 12 hits at base# 32 

BanII GRGCYc 11
37: 51 38: 51 39: 51 40: 51 41: 51 42: 51
43: 51 44: 51 45: 51 46: 51 47: 51 
There are 11 hits at base# 51 

Ecl136I GAGctc 11
37: 51 38: 51 39: 51 40: 51 41: 51 42: 51
43: 51 44: 51 45: 51 46: 51 47: 51 
There are 11 hits at base# 51 



SacI GAGCTc 11



37: 51 38: 51 39: 51 40: 
43: 51 44: 51 45: 51 46: 
There are 11 hits at base# 51 
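
The entries in Table 200 each give a recognition sequence written with IUPAC ambiguity codes, a list of GLG id and base position pairs, and a summary of how many hits fall at each position. A minimal sketch of how such a tally could be produced is shown below. It is an illustration only: the function names, the toy sequences, and the convention that a hit is reported at the 1-based position of the first base of the match are all assumptions, since the excerpt does not state how base# is counted. Only the strand given is scanned, which mirrors the table's practice of listing the reverse-complement pattern as a separate row.

```python
import re
from collections import Counter

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "M": "[AC]", "K": "[GT]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def site_pattern(site):
    """Compile a recognition sequence into a lookahead regex (finds overlaps)."""
    body = "".join(IUPAC[base] for base in site.upper())
    return re.compile("(?=" + body + ")")


def find_sites(site, seq):
    """Return 1-based start positions of every match of site in seq."""
    return [m.start() + 1 for m in site_pattern(site).finditer(seq.upper())]


def tally_hits(site, glgs):
    """Return ([(glg_id, position), ...], Counter of positions) for a set of GLGs."""
    hits = [(gid, pos) for gid, seq in glgs.items() for pos in find_sites(site, seq)]
    return hits, Counter(pos for _, pos in hits)


# Hypothetical toy sequences, not real germline genes.
toy_glgs = {1: "AAGCTAAGCTT", 2: "TTGCTGAGCAA", 3: "ACGTACGTACG"}
hits, by_position = tally_hits("GCTNAGC", toy_glgs)
for position, count in by_position.most_common():
    print(f"There are {count} hits at base# {position}")
```

Case in the table's recognition strings (for example `GCtnagc`) is ignored here; the sketch upper-cases everything before matching.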



Table 206: Synthetic 3-23 FR3 of human heavy chains showing positions of possible cleavage sites



Sites engineered into the synthetic gene are shown in upper case DNA
with the RE name between vertical bars (as in |XbaI|).
RERSs frequently found in GLGs are shown below the synthetic sequence
with the name to the right (as in gtn ac = MaeIII (24), indicating that
24 of the 51 GLGs contain the site).

              FR3
               89  90   (codon # in synthetic 3-23)
                R   F
              |cgc|ttc|  6
Allowed DNA   |cgn|tty|
              |agr|

ga ntc = HinfI (38)
ga gtc = PleI (18)
ga wtc = TfiI (20)

gtn ac = MaeIII (24)
gts ac = Tsp45I (21)
tc acc = HphI (44)

FR3

          91  92  93  94  95  96  97  98  99 100 101 102 103 104 105
           T   I   S   R   D   N   S   K   N   T   L   Y   L   Q   M
        |act|atc|TCT|AGA|gac|aac|tct|aag|aat|act|ctc|tac|ttg|cag|atg| 51
allowed |acn|ath|tcn|cgn|gay|aay|tcn|aar|aay|acn|ttr|tay|ttr|car|atg|
        |   |   |agy|agr|   |   |agy|   |   |   |ctn|   |ctn|

ga gac = BsmAI (16)    ag ct = AluI (23)

c tcc ag = BpmI (19)    g ctn agc = BlpI (21)

g aan nnn ttc = XmnI (12)

|XbaI|    tgca = HpyCH4V (21)

FR3 ->|

         106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
           N   S   L   R   A   E   D   T   A   V   Y   Y   C   A   K
        |aac|agC|TTA|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat|tgc|gct|aaa| 96
allowed |aay|tcn|ttr|cgn|gcn|gar|gay|acn|gcn|gtn|tay|tay|tgy|gcn|aar|
        |   |agy|ctn|agr|

cc nng g = BsaJI (23)    ac ngt = Bst4CI (51)
aga tct = BglII (10)    ac ngt = HpyCH4III (51)
Rga tcY = BstYI (11)    ac ngt = TaaI (51)
c ayn nnn rtg = MslI (44)
cg ryc g = BsiEI (23)
yg gcc r = EaeI (23)
cg gcc g = EagI (23)
g gcc = HaeIII (25)
gag g = MnlI (31)

|AflII|
|PstI|
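
Table 206 lists, under each codon of the synthetic gene, the degenerate codons that would still encode the same amino acid (the "allowed" rows). A minimal sketch of how a synthetic codon choice could be checked against those rows is given below. The names `ALLOWED`, `codon_matches` and `disallowed_codons` are hypothetical; the amino-acid-to-codon map contains only the residues of codons 91 to 105 as read from the table, so it is a partial map and not a full genetic code.

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG",
    "M": "AC", "K": "GT", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# "Allowed DNA" entries for the residues in codons 91-105 of Table 206.
ALLOWED = {
    "T": ["ACN"], "I": ["ATH"], "S": ["TCN", "AGY"], "R": ["CGN", "AGR"],
    "D": ["GAY"], "N": ["AAY"], "K": ["AAR"], "L": ["TTR", "CTN"],
    "Y": ["TAY"], "Q": ["CAR"], "M": ["ATG"],
}


def codon_matches(codon, degenerate):
    """True if a concrete codon is one of the codons a degenerate codon covers."""
    return len(codon) == 3 and all(
        base in IUPAC[code] for base, code in zip(codon.upper(), degenerate)
    )


def disallowed_codons(protein, codons):
    """Return (index, residue, codon) for every codon outside the allowed set."""
    if len(protein) != len(codons):
        raise ValueError("one codon is required per residue")
    return [
        (i, aa, codon)
        for i, (aa, codon) in enumerate(zip(protein, codons))
        if not any(codon_matches(codon, d) for d in ALLOWED[aa])
    ]


protein = "TISRDNSKNTLYLQM"  # codons 91-105
codons = "act atc TCT AGA gac aac tct aag aat act ctc tac ttg cag atg".split()
print(disallowed_codons(protein, codons))
```

The upper-case `TCT AGA` pair is the engineered XbaI site; the check confirms that placing it there stays within the allowed codons for S and R.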



Table 217: Human HC GLG FR1 Sequences

VH Exon - Nucleotide sequence alignment 

VH1

1-02 CAG GTG CAG CTG GTG CAG TCT GGG GCT GAG GTG AAG AAG CCT GGG GCC TCA GTG AAG
GTC TCC TGC AAG GCT TCT GGA TAC ACC TTC ACC
1-03 cag gtC cag ctT gtg cag tct ggg gct gag gtg aag aag cct ggg gcc tca gtg aag
gtT tcc tgc aag gct tct gga tac acc ttc acT
1-08 cag gtg cag ctg gtg cag tct ggg gct gag gtg aag aag cct ggg gcc tca gtg aag
gtc tcc tgc aag gct tct gga tac acc ttc acc
1-18 cag gtT cag ctg gtg cag tct ggA gct gag gtg aag aag cct ggg gcc tca gtg aag
gtc tcc tgc aag gct tct ggT tac acc ttT acc
1-24 cag gtc cag ctg gtA cag tct ggg gct gag gtg aag aag cct ggg gcc tca gtg aag
gtc tcc tgc aag gTt tcC gga tac acc ctc acT
1-45 cag Atg cag ctg gtg cag tct ggg gct gag gtg aag aag Act ggg Tcc tca gtg aag
gtT tcc tgc aag gct tcC gga tac acc ttc acc
1-46 cag gtg cag ctg gtg cag tct ggg gct gag gtg aag aag cct ggg gcc tca gtg aag
gtT tcc tgc aag gcA tct gga tac acc ttc acc
1-58 caA Atg cag ctg gtg cag tct ggg Cct gag gtg aag aag cct ggg Acc tca gtg aag
gtc tcc tgc aag gct tct gga tTc acc ttT acT
1-69 cag gtg cag ctg gtg cag tct ggg gct gag gtg aag aag cct ggg Tcc tcG gtg aag
gtc tcc tgc aag gct tct gga GGc acc ttc aGc
1-e cag gtg cag ctg gtg cag tct ggg gct gag gtg aag aag cct ggg Tcc tcG gtg aag
gtc tcc tgc aag gct tct gga GGc acc ttc aGc
1-f Gag gtC cag ctg gtA cag tct ggg gct gag gtg aag aag cct ggg gcT Aca gtg aaA
Atc tcc tgc aag gTt tct gga tac acc ttc acc

VH2

2-05 CAG ATC ACC TTG AAG GAG TCT GGT CCT ACG CTG GTG AAA CCC ACA CAG ACC CTC ACG
CTG ACC TGC ACC TTC TCT GGG TTC TCA CTC AGC
2-26 cag Gtc acc ttg aag gag tct ggt cct GTg ctg gtg aaa ccc aca Gag acc ctc acg
ctg acc tgc acc Gtc tct ggg ttc tca ctc agc
2-70 cag Gtc acc ttg aag gag tct ggt cct Gcg ctg gtg aaa ccc aca cag acc ctc acA
ctg acc tgc acc ttc tct ggg ttc tca ctc agc

VH3

3-07 GAG GTG CAG CTG GTG GAG TCT GGG GGA GGC TTG GTC CAG CCT GGG GGG TCC CTG AGA
CTC TCC TGT GCA GCC TCT GGA TTC ACC TTT AGT
3-09 gaA gtg cag ctg gtg gag tct ggg gga ggc ttg gtA cag cct ggC Agg tcc ctg aga
ctc tcc tgt gca gcc tct gga ttc acc ttt GAt
3-11 Cag gtg cag ctg gtg gag tct ggg gga ggc ttg gtc Aag cct ggA ggg tcc ctg aga
ctc tcc tgt gca gcc tct gga ttc acc ttc agt
3-13 gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtA cag cct ggg ggg tcc ctg aga
ctc tcc tgt gca gcc tct gga ttc acc ttC agt
3-15 gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtA Aag cct ggg ggg tcc ctT aga
ctc tcc tgt gca gcc tct gga ttc acT ttC agt
3-20 gag gtg cag ctg gtg gag tct ggg gga ggT Gtg gtA cGg cct ggg ggg tcc ctg aga
ctc tcc tgt gca gcc tct gga ttc acc ttt GAt
3-21 gag gtg cag ctg gtg gag tct ggg gga ggc Ctg gtc Aag cct ggg ggg tcc ctg aga
ctc tcc tgt gca gcc tct gga ttc acc ttc agt
3-23 gag gtg cag ctg Ttg gag tct ggg gga ggc ttg gtA cag cct ggg ggg tcc ctg aga
ctc tcc tgt gca gcc tct gga ttc acc ttt agC
3-30 Cag gtg cag ctg gtg gag tct ggg gga ggc Gtg gtc cag cct ggg Agg tcc ctg aga
ctc tcc tgt gca gcc tct gga ttc acc ttC agt
3-30.3 Cag gtg cag ctg gtg gag tct ggg gga ggc Gtg gtc cag cct ggg Agg tcc ctg aga
ctc tcc tgt gca gcc tct gga ttc acc ttc agt
3-30.5 Cag gtg cag ctg gtg gag tct ggg gga ggc Gtg gtc cag cct ggg Agg tcc ctg aga
ctc tcc tgt gca gcc tct gga ttc acc ttC agt
3-33 Cag gtg cag ctg gtg gag tct ggg gga ggc Gtg gtc cag cct ggg Agg tcc ctg aga
ctc tcc tgt gca gcG tct gga ttc acc ttC agt
3-43 gaA gtg cag ctg gtg gag tct ggg gga gTc Gtg gtA cag cct ggg ggg tcc ctg aga
ctc tcc tgt gca gcc tct gga ttc acc ttt GAt
3-48 gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtA cag cct ggg ggg tcc ctg aga
ctc tcc tgt gca gcc tct gga ttc acc ttc agt
3-49 gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtA cag ccA ggg Cgg tcc ctg aga
ctc tcc tgt Aca gcT tct gga ttc acc ttt Ggt
3-53 gag gtg cag ctg gtg gag Act ggA gga ggc ttg Atc cag cct ggg ggg tcc ctg aga
ctc tcc tgt gca gcc tct ggG ttc acc GtC agt
3-64 gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtc cag cct ggg ggg tcc ctg aga
ctc tcc tgt gca gcc tct gga ttc acc ttc agt
3-66 gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtc cag cct ggg ggg tcc ctg aga
ctc tcc tgt gca gcc tct gga ttc acc GtC agt
3-72 gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtc cag cct ggA ggg tcc ctg aga
ctc tcc tgt gca gcc tct gga ttc acc ttC agt
3-73 gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtc cag cct ggg ggg tcc ctg aAa
ctc tcc tgt gca gcc tct ggG ttc acc ttC agt
3-74 gag gtg cag ctg gtg gag tcC ggg gga ggc ttA gtT cag cct ggg ggg tcc ctg aga
ctc tcc tgt gca gcc tct gga ttc acc ttC agt
3-d gag gtg cag ctg gtg gag tct Cgg gga gTc ttg gtA cag cct ggg ggg tcc ctg aga
ctc tcc tgt gca gcc tct gga ttc acc GtC agt

VH4

4-04 CAG GTG CAG CTG CAG GAG TCG GGC CCA GGA CTG GTG AAG CCT TCG GGG ACC CTG TCC
CTC ACC TGC GCT GTC TCT GGT GGC TCC ATC AGC
4-28 cag gtg cag ctg cag gag tcg ggc cca gga ctg gtg aag cct tcg gAC acc ctg tcc
ctc acc tgc gct gtc tct ggt TAc tcc atc agc
4-30.1 cag gtg cag ctg cag gag tcg ggc cca gga ctg gtg aag cct tcA CAg acc ctg tcc
ctc acc tgc Act gtc tct ggt ggc tcc atc agc
4-30.2 cag Ctg cag ctg cag gag tcC ggc Tca gga ctg gtg aag cct tcA CAg acc ctg tcc
ctc acc tgc gct gtc tct ggt ggc tcc atc agc
4-30.4 cag gtg cag ctg cag gag tcg ggc cca gga ctg gtg aag cct tcA CAg acc ctg tcc
ctc acc tgc Act gtc tct ggt ggc tcc atc agc



4—31 




gtg 


cag 


ctg 


cag 


y«*y 


teg 


aac 


cca 


acta 


ctg 


gtg 


aag 


cct 


tcA CAg 


acc 


ctg 


tec 




etc 


ace 


tac 


Act 


ate 


tct 


aat 


ggc 


tec 


ate 


age 


















4-34 




y *-y 


cag 


ctA 


cag 


Cag 


tGg 


aac 


Gca 




ctg 


Ttg 


aag 


cct 


teg 


gAg 


acc 


ctg 


tec 




etc 


acc 


tgc 


act 


ate 
y 


tAt 


aat 




tec 


Ttc 


agT 


















4—39 


cag 


Cta 


cag 


c tg 


cag 


y a \t 


teg 


ggc 


cca 


gga 


ctg 


gtg 


aag 


cct 


teg 


aAa 
y^*y 


acc 


ctg 


tec 




etc 


acc 


tgc 


Act 


gtc 


tct 


yy u 


rrrrc 


tec 


ate 


age 


















A CO 


cag 


gtg 


cag 


ctg 


cag 


gag 


teg 


9"9* c 




gga 


ctg 


gtg 


aag 


cct 


teg 


y^y 


acc 


ctg 


tec 




etc 


acc 


cyc 


Act 


gtc 




ggt 


ggc 


tec 


ate 


agT 


















'i DJ. 


cag 


gtg 


cag 




cag 


y d y 


teg 


yy*- 


cca 


aaa 
yy° 


ctg 


gtg 


aag 


cct 


teg 


crAcr 
y"y 


acc 


ctg 


tec 








tgc 


Act 


gtc 


tct 


Cf Cft 




tec 


Gtc 


age 


















A V* 


cag 


gtg 


cag 


ctg 


cag 


9* a 9* 


teg 


ggc 




gga 


ctg 


gtg 


aag 


cct 


tan 


gAg 


acc 


ctg 


tec 




etc 


acc 


tgc 


get 


gtc 


tct 


ggt 


TAc 




ate 


age 


















VH5








































5 — 5 1 


GAG 


GTG 


CAG 


CTG 






TCT 


GGA 


GCA 


GAG 


GTG 


AAA AAG 


CCC 


GGG 


GAG 


TCT 


CTG 


AAG 




ATC 


TCC 




AAv» 


LrLri. 






Tar* 


Aw* 




ACC 


















5-a 


gaA 


gtg 


cag 


ctg 




cag 


tct 


gga 


gca 


gag 


gtg 


aaa 


aag 


ccc 


ggg 


gag 


tct 


ctg 


aGg 




ate 


tec 


tgt 


aag 


ggt 


tct 


gga 


tac 


age 


ttt 


acc 


















VH6 








































6-1 CAG GTA CAG CTG CAG CAG TCA GGT CCA GGA CTG GTG AAG CCC TCG CAG ACC CTC TCA
CTC ACC TGT GCC ATC TCC GGG GAC AGT GTC TCT


















VH7 








































7-4.1 CAG GTG CAG CTG GTG CAA TCT GGG TCT GAG TTG AAG AAG CCT GGG GCC TCA GTG AAG
GTT TCC TGC AAG GCT TCT GGA TAC ACC TTC ACT



















Table 220: RERS sites in Human HC GLG FR1s where there are at least 20 GLGs cut



BsgI GTGCAG 71 (16/14 bases)
1: 4 1: 13 2: 13 3: 4 3: 13 4: 13
6: 13 7: 4 7: 13 8: 13 9: 4 9: 13
10: 4 10: 13 15: 4 15: 65 16: 4 16: 65
17: 4 17: 65 18: 4 18: 65 19: 4 19: 65
20: 4 20: 65 21: 4 21: 65 22: 4 22: 65
23: 4 23: 65 24: 4 24: 65 25: 4 25: 65
26: 4 26: 65 27: 4 27: 65 28: 4 28: 65
29: 4 30: 4 30: 65 31: 4 31: 65 32: 4
32: 65 33: 4 33: 65 34: 4 34: 65 35: 4
35: 65 36: 4 36: 65 37: 4 38: 4 39: 4
41: 4 42: 4 43: 4 45: 4 46: 4 47: 4
48: 4 48: 13 49: 4 49: 13 51: 4

There are 39 hits at base# 4

There are 21 hits at base# 65



ctgcac
12: 63 13: 63 14: 63 39: 63 41: 63 42: 63
44: 63 45: 63 46: 63














BbvI GCAGC 65
1: 6 3: 6 6: 6 7: 6 8: 6 9: 6
10: 6 15: 6 15: 67 16: 6 16: 67 17: 6
17: 67 18: 6 18: 67 19: 6 19: 67 20: 6
20: 67 21: 6 21: 67 22: 6 22: 67 23: 6
23: 67 24: 6 24: 67 25: 6 25: 67 26: 6
26: 67 27: 6 27: 67 28: 6 28: 67 29: 6
30: 6 30: 67 31: 6 31: 67 32: 6 32: 67
33: 6 33: 67 34: 6 34: 67 35: 6 35: 67
36: 6 36: 67 37: 6 38: 6 39: 6 40: 6
41: 6 42: 6 43: 6 44: 6 45: 6 46: 6
47: 6 48: 6 49: 6 50: 12 51: 6

There are 43 hits at base# 6 Bolded sites very near sites listed below

There are 21 hits at base# 67

-"- gctgc 13
37: 9 38: 9 39: 9 40: 3 40: 9 41: 9
42: 9 44: 3 44: 9 45: 9 46: 9 47: 9
50: 9

There are 11 hits at base# 9



BsoFI GCngc 78
1: 6 3: 6 6: 6 7: 6 8: 6 9: 6
10: 6 15: 6 15: 67 16: 6 16: 67 17: 6
17: 67 18: 6 18: 67 19: 6 19: 67 20: 6
20: 67 21: 6 21: 67 22: 6 22: 67 23: 6
23: 67 24: 6 24: 67 25: 6 25: 67 26: 6
26: 67 27: 6 27: 67 28: 6 28: 67 29: 6
30: 6 30: 67 31: 6 31: 67 32: 6 32: 67
33: 6 33: 67 34: 6 34: 67 35: 6 35: 67
36: 6 36: 67 37: 6 37: 9 38: 6 38: 9
39: 6 39: 9 40: 3 40: 6 40: 9 41: 6
41: 9 42: 6 42: 9 43: 6 44: 3 44: 6
44: 9 45: 6 45: 9 46: 6 46: 9 47: 6
47: 9 48: 6 49: 6 50: 9 50: 12 51: 6

There are 43 hits at base# 6 These often occur together.
There are 11 hits at base# 9
There are 2 hits at base# 3
There are 21 hits at base# 67













TseI Gcwgc 78
1: 6 3: 6 6: 6 7: 6 8: 6 9: 6
10: 6 15: 6 15: 67 16: 6 16: 67 17: 6
17: 67 18: 6 18: 67 19: 6 19: 67 20: 6
20: 67 21: 6 21: 67 22: 6 22: 67 23: 6
23: 67 24: 6 24: 67 25: 6 25: 67 26: 6
26: 67 27: 6 27: 67 28: 6 28: 67 29: 6
30: 6 30: 67 31: 6 31: 67 32: 6 32: 67
33: 6 33: 67 34: 6 34: 67 35: 6 35: 67
36: 6 36: 67 37: 6 37: 9 38: 6 38: 9
39: 6 39: 9 40: 3 40: 6 40: 9 41: 6
41: 9 42: 6 42: 9 43: 6 44: 3 44: 6
44: 9 45: 6 45: 9 46: 6 46: 9 47: 6
47: 9 48: 6 49: 6 50: 9 50: 12 51: 6

There are 43 hits at base# 6 Often together.
There are 11 hits at base# 9
There are 2 hits at base# 3
There are 1 hits at base# 12
There are 21 hits at base# 67



MspA1I CMGckg 48

There are 46 hits at base# 7
There are 2 hits at base# 1







1: 


7 


3: 


7 


4: 


7 


5: 


7 


6 : 


7 






8: 


7 


9: 


7 


10: 


7 


11: 


7 


15: 


7 






17: 


7 


18 : 


7 


19: 


7 


20: 


7 


21 : 


7 






23: 


7 


24 : 


7 


25: 


7 


26: 


7 


27 : 


7 




10 


29: 


7 


30: 


7 


31: 


7 


32: 


7 


33 : 


7 






35: 


7 


36: 


7 


37: 


7 


38: 


7 


39: 


7 






40: 


7 


41: 


7 


42: 


7 


44: 


-i 
1 


A A • 


/ 






46: 


7 


47: 


7 


48: 


7 


49: 


7 


50 : 


7 






There are 46 hits at base# 7

Analysis repeated using only 8 best REdaptors

Id  Ntot    0    1    2    3    4    5    6    7   8+
1    301   78  101   54   32   16    9   10    1    0   281   102#1   ccgtgtattactgtgcgagaga
2    493   69  155  125   73   37   14   11    3    6   459   103#2   ctgtgtattactgtgcgagaga
3    189   52   45   38   23   18    5    4    1    3   176   108#3   ccgtgtattactgtgcgagagg
4    127   29   23   28   24   10    6    5    2    0   114   323#22  ccgtatattactgtgcgaaaga
5     78   21   25   14   11    1    4    2    0    0    72   330#23  ctgtgtattactgtgcgaaaga
6     79   15   17   25    8   11    1    2    0    0    76   439#44  ctgtgtattactgtgcgagaca
7     43   14   15    5    5    3    0    1    0    0    42   551#48  ccatgtattactgtgcgagaca
8    307   26   63   72   51   38   24   14   13    6   250   5a#49   ccatgtattactgtgcgaga

1  102#1   ccgtgtattactgtgcgagaga  ccgtgtattactgtgcgagaga
2  103#2   ctgtgtattactgtgcgagaga  .t....................
3  108#3   ccgtgtattactgtgcgagagg  .....................g
4  323#22  ccgtatattactgtgcgaaaga  ....a.............a...
5  330#23  ctgtgtattactgtgcgaaaga  .t................a...
6  439#44  ctgtgtattactgtgcgagaca  .t..................c.
7  551#48  ccatgtattactgtgcgagaca  ..a.................c.
8  5a#49   ccatgtattactgtgcgagaAA  ..a.................AA

Seqs with the expected RE site only ..... 1463 / 1617
Seqs with only an unexpected site ....... 0
Seqs with both expected and unexpected .. 7
Seqs with no sites ...................... 0
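
The mismatch columns in the REdaptor table above can be checked mechanically. The following is a minimal Python sketch, not the procedure used to produce the table: it counts position-by-position mismatches between two equal-length oligonucleotides and bins a list of mismatch counts into the 0 to 8+ columns. The names `mismatches` and `histogram` are illustrative assumptions; the sequences and counts are copied from rows 1, 2 and 4 of the table.

```python
from collections import Counter

def mismatches(probe: str, target: str) -> int:
    """Count positions at which two equal-length sequences differ (case-insensitive)."""
    if len(probe) != len(target):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(probe.lower(), target.lower()))

def histogram(counts, cap: int = 8):
    """Bin mismatch counts into columns 0..cap; the last column means 'cap or more'."""
    tally = Counter(min(c, cap) for c in counts)
    return [tally.get(i, 0) for i in range(cap + 1)]

REDAPTOR_1 = "ccgtgtattactgtgcgagaga"  # 102#1
REDAPTOR_2 = "ctgtgtattactgtgcgagaga"  # 103#2
REDAPTOR_4 = "ccgtatattactgtgcgaaaga"  # 323#22

print(mismatches(REDAPTOR_1, REDAPTOR_2))  # 1
print(mismatches(REDAPTOR_1, REDAPTOR_4))  # 2

# Row 1 of the table: the nine mismatch bins add up to Ntot, and the
# unlabeled column after the 8+ bin equals the sum of the 0..4 bins.
row1_bins = [78, 101, 54, 32, 16, 9, 10, 1, 0]
print(sum(row1_bins), sum(row1_bins[:5]))  # 301 281
```

The same arithmetic holds for the other seven rows, which is how the split entries "17 6" and "14 63" in the scan can be read as 176 and 1463.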






Table 300: Kappa FR1 GLGs

! 1 2 3 4 5 6 7 8 9 10 11 12
GAC ATC CAG ATG ACC CAG TCT CCA TCC TCC CTG TCT
! 13 14 15 16 17 18 19 20 21 22 23
GCA TCT GTA GGA GAC AGA GTC ACC ATC ACT TGC ! O12
GAC ATC CAG ATG ACC CAG TCT CCA TCC TCC CTG TCT
GCA TCT GTA GGA GAC AGA GTC ACC ATC ACT TGC ! O2
GAC ATC CAG ATG ACC CAG TCT CCA TCC TCC CTG TCT
GCA TCT GTA GGA GAC AGA GTC ACC ATC ACT TGC ! O18
GAC ATC CAG ATG ACC CAG TCT CCA TCC TCC CTG TCT
GCA TCT GTA GGA GAC AGA GTC ACC ATC ACT TGC ! O8
GAC ATC CAG ATG ACC CAG TCT CCA TCC TCC CTG TCT
GCA TCT GTA GGA GAC AGA GTC ACC ATC ACT TGC ! A20
GAC ATC CAG ATG ACC CAG TCT CCA TCC TCC CTG TCT
GCA TCT GTA GGA GAC AGA GTC ACC ATC ACT TGC ! A30
AAC ATC CAG ATG ACC CAG TCT CCA TCT GCC ATG TCT
GCA TCT GTA GGA GAC AGA GTC ACC ATC ACT TGT ! L14
GAC ATC CAG ATG ACC CAG TCT CCA TCC TCA CTG TCT
GCA TCT GTA GGA GAC AGA GTC ACC ATC ACT TGT ! L1
GAC ATC CAG ATG ACC CAG TCT CCA TCC TCA CTG TCT
GCA TCT GTA GGA GAC AGA GTC ACC ATC ACT TGT ! L15
GCC ATC CAG TTG ACC CAG TCT CCA TCC TCC CTG TCT
GCA TCT GTA GGA GAC AGA GTC ACC ATC ACT TGC ! L4
GCC ATC CAG TTG ACC CAG TCT CCA TCC TCC CTG TCT
GCA TCT GTA GGA GAC AGA GTC ACC ATC ACT TGC ! L18
GAC ATC CAG ATG ACC CAG TCT CCA TCT TCC GTG TCT
GCA TCT GTA GGA GAC AGA GTC ACC ATC ACT TGT ! L5
GAC ATC CAG ATG ACC CAG TCT CCA TCT TCT GTG TCT
GCA TCT GTA GGA GAC AGA GTC ACC ATC ACT TGT ! L19
GAC ATC CAG TTG ACC CAG TCT CCA TCC TTC CTG TCT
GCA TCT GTA GGA GAC AGA GTC ACC ATC ACT TGC ! L8
GCC ATC CGG ATG ACC CAG TCT CCA TTC TCC CTG TCT
GCA TCT GTA GGA GAC AGA GTC ACC ATC ACT TGC ! L23
GCC ATC CGG ATG ACC CAG TCT CCA TCC TCA TTC TCT
GCA TCT ACA GGA GAC AGA GTC ACC ATC ACT TGT ! L9
GTC ATC TGG ATG ACC CAG TCT CCA TCC TTA CTC TCT 
GCC
TCT





TTG TCT CCA GGG GAA AGA GCC ACC CTC TCC TGC !
GAC ATC GTG ATG ACC CAG TCT CCA GAC TCC CTG GCT
GTG TCT CTG GGC GAG AGG GCC ACC ATC AAC TGC !
GAA ACG ACA CTC ACG CAG TCT CCA GCA TTC ATG TCA
GCG ACT CCA GGA GAC AAA GTC AAC ATC TCC TGC !
GAA ATT GTG CTG ACT CAG TCT CCA GAC TTT CAG TCT
GTG ACT CCA AAG GAG AAA GTC ACC ATC ACC TGC !
GAA ATT GTG CTG ACT CAG TCT CCA GAC TTT CAG TCT
GTG ACT CCA AAG GAG AAA GTC ACC ATC ACC TGC !
GAT GTT GTG ATG ACA CAG TCT CCA GCT TTC CTC TCT
GTG ACT CCA GGG GAG AAA GTC ACC ATC ACC TGC !
CAG TCT GCC CTG ACT CAG CCT CCC TCC GCG TCC GGG
TCT CCT GGA CAG TCA GTC ACC ATC TCC TGC ! 2c
cag tct gcc ctg act cag cct cGc tcA gTg tcc ggg
tct cct gga cag tca gtc acc atc tcc tgc ! 2e
cag tct gcc ctg act cag cct Gcc tcc gTg tcT ggg
tct cct gga cag tcG Atc acc atc tcc tgc ! 2a2
cag tct gcc ctg act cag cct ccc tcc gTg tcc ggg
tct cct gga cag tca gtc acc atc tcc tgc ! 2d
cag tct gcc ctg act cag cct Gcc tcc gTg tcT ggg
tct cct gga cag tcG Atc ... atc tcc tgc ! 2b2




! VL3
TCC TAT GAG CTG ACT CAG CCA CCC TCA GTG TCC GTG
TCC CCA GGA CAG ACA GCC AGC ATC ACC TGC ! 3r
tcc tat gag ctg act cag cca cTc tca gtg tcA gtg
Gcc cTG gga cag acG gcc agG atT acc tgT ! 3j
tcc tat gag ctg acA cag cca ccc tcG gtg tcA gtg
tcc cca gga caA acG gcc agG atc acc tgc ! 3p
tcc tat gag ctg acA cag cca ccc tcG gtg tcA gtg
tcc cTa gga cag aTG gcc agG atc acc tgc ! 3a
tcT tct gag ctg act cag GAC ccT GcT gtg tcT gtg
Gcc TTG gga cag aca gTc agG atc acA tgc ! 3l
tcc tat gTg ctg act cag cca ccc tca gtg tcA gtg
Gcc cca gga Aag acG gcc agG atT acc tgT ! 3h
tcc tat gag ctg acA cag cTa ccc tcG gtg tcA gtg
tcc cca gga cag aca gcc agG atc acc tgc ! 3e
tcc tat gag ctg aTG cag cca ccc tcG gtg tcA gtg
... cca gga cag acG gcc agG atc acc tgc ! 3m
tcc tat gag ctg acA cag cca Tcc tca gtg tcA gtg
tcT ccG gga cag aca gcc agG atc acc tgc ! V2-19






! VL4
CTG CCT GTG CTG ACT CAG CCC CCG TCT GCA TCT GCC
TTG CTG GGA GCC TCG ATC AAG CTC ACC TGC ! 4c
cAg cct gtg ctg act caA TcA TcC tct gcC tct gcT
tcc ctg gga Tcc tcg Gtc aag ctc acc tgc ! 4a
cAg cTt gtg ctg act caA TcG ccC tct gcC tct gcc
tcc ctg gga gcc tcg Gtc aag ctc acc tgc ! 4b




! VL5
CAG CCT GTG CTG ACT CAG CCA CCT TCC TCC TCC GCA
TCT CCT GGA GAA TCC GCC AGA CTC ACC TGC ! 5e
cag Gct gtg ctg act cag ccG Gct tcc CTc tcT gca
tct cct gga gCa tcA gcc agT ctc acc tgc ! 5c
cag cct gtg ctg act cag cca Tct tcc CAT tcT gca
tct Tct gga gCa tcA gTc aga ctc acc tgc ! 5b






! VL6
AAT TTT ATG CTG ACT CAG CCC CAC TCT GTG TCG GAG
TCT CCG GGG AAG ACG GTA ACC ATC TCC TGC ! 6a






! VL7
CAG ACT GTG GTG ACT CAG GAG CCC TCA CTG ACT GTG
TCC CCA GGA GGG ACA GTC ACT CTC ACC TGT ! 7a
cag Gct gtg gtg act cag gag ccc tca ctg act gtg
tcc cca gga ggg aca gtc act ctc acc tgt ! 7b
! VL8
CAG ACT GTG GTG ACC CAG GAG CCA TCG TTC TCA GTG
TCC CCT GGA GGG ACA GTC ACA CTC ACT TGT ! 8a



! VL9
CAG CCT GTG CTG ACT CAG CCA CCT TCT GCA TCA GCC
TCC CTG GGA GCC TCG GTC ACA CTC ACC TGC ! 9a
! VL10
CAG GCA GGG CTG ACT CAG CCA CCC TCG GTG TCC AAG
GGC TTG AGA CAG ACC GCC ACA CTC ACC TGC ! 10a





10 



Table 405: RERSs found in human lambda FR1 GLGs
! There are 31 lambda GLGs
MlyI NnnnnnGACTC 25
1: 6  3: 6  4: 6  6: 6  7: 6  8: 6
9: 6  10: 6  11: 6  12: 6  15: 6  16: 6
20: 6  21: 6  22: 6  23: 6  23: 50  24: 6
25: 6  25: 50  26: 6  27: 6  28: 6  30: 6
31: 6
There are 23 hits at base# 6
-"- GAGTCNNNNNn
26: 34



MwoI GCNNNNNnngc 20
1: 9  2: 9  3: 9  4: 9  11: 9  11: 56
12: 9  13: 9  14: 9  16: 9  17: 9  18: 9
19: 9  20: 9  23: 9  24: 9  25: 9  26: 9
30: 9  31: 9
There are 19 hits at base# 9

HinfI Gantc 27
1: 12  3: 12  4: 12  6: 12  7: 12  8: 12
9: 12  10: 12  11: 12  12: 12  15: 12  16: 12
20: 12  21: 12  22: 12  23: 12  23: 46  23: 56
24: 12  25: 12  25: 56  26: 12  26: 34  27: 12
28: 12  30: 12  31: 12
There are 23 hits at base# 12

PleI gactc 25
1: 12  3: 12  4: 12  6: 12  7: 12  8: 12
9: 12  10: 12  11: 12  12: 12  15: 12  16: 12
20: 12  21: 12  22: 12  23: 12  23: 56  24: 12
25: 12  25: 56  26: 12  27: 12  28: 12  30: 12
31: 12
There are 23 hits at base# 12
-"- gagtc
26: 34






DdeI Ctnag 32
1: 14  2: 24  3: 14  3: 24  4: 14  4: 24
5: 24  6: 14  7: 14  7: 24  8: 14  9: 14
10: 14  11: 14  11: 24  12: 14  12: 24  15: 5
15: 14  16: 14  16: 24  19: 24  20: 14  23: 14
24: 14  25: 14  26: 14  27: 14  28: 14  29: 30
30: 14  31: 14
There are 21 hits at base# 14



BsaJI Ccnngg 38
1: 23  1: 40  2: 39  2: 40  3: 39  3: 40
4: 39  4: 40  5: 39  11: 39  12: 38  12: 39
13: 23  13: 39  14: 23  14: 39  15: 38  16: 39
17: 23  17: 39  18: 23  18: 39  21: 38  21: 39
21: 47  22: 38  22: 39  22: 47  26: 40  27: 39
28: 39  29: 14  29: 39  30: 38  30: 39  30: 47
31: 23  31: 32
There are 17 hits at base# 39
There are 5 hits at base# 38
There are 5 hits at base# 40 Makes cleavage ragged.

MnlI cctc 35
1: 23  2: 23  3: 23  4: 23  5: 23  6: 19
6: 23  7: 19  8: 23  9: 19  9: 23  10: 23
11: 23  13: 23  14: 23  16: 23  17: 23  18: 23
19: 23  20: 47  21: 23  21: 29  21: 47  22: 23
22: 29  22: 35  22: 47  23: 26  23: 29  24: 27
27: 23  28: 23  30: 35  30: 47  31: 23
There are 21 hits at base# 23
There are 3 hits at base# 19
There are 3 hits at base# 29
There are 1 hits at base# 26
There are 1 hits at base# 27 These could make cleavage ragged.
-"- gagg 7
1: 48  2: 48  3: 48  4: 48  27: 44  28: 44
29: 44



BssKI Nccngg 39
1: 40  2: 39  3: 39  3: 40  4: 39  4: 40
5: 39  6: 31  6: 39  7: 31  7: 39  8: 39
9: 31  9: 39  10: 39  11: 39  12: 38  12: 52
13: 39  13: 52  14: 52  16: 39  16: 52  17: 39
17: 52  18: 39  18: 52  19: 39  19: 52  21: 38
22: 38  23: 39  24: 39  26: 39  27: 39  28: 39
29: 14  29: 39  30: 38
There are 21 hits at base# 39
There are 4 hits at base# 38
There are 3 hits at base# 31
There are 3 hits at base# 40 Ragged

BstNI CCwgg 30
1: 41  2: 40  5: 40  6: 40  7: 40  8: 40
9: 40  10: 40  11: 40  12: 39  12: 53  13: 40
13: 53  14: 53  16: 40  16: 53  17: 40  17: 53
18: 40  18: 53  19: 53  21: 39  22: 39  23: 40
24: 40  27: 40  28: 40  29: 15  29: 40  30: 39
There are 17 hits at base# 40
There are 7 hits at base# 53
There are 4 hits at base# 39
There are 1 hits at base# 41 Ragged

PspGI ccwgg 30
1: 41  2: 40  5: 40  6: 40  7: 40  8: 40
9: 40  10: 40  11: 40  12: 39  12: 53  13: 40
13: 53  14: 53  16: 40  16: 53  17: 40  17: 53
18: 40  18: 53  19: 53  21: 39  22: 39  23: 40
24: 40  27: 40  28: 40  29: 15  29: 40  30: 39
There are 17 hits at base# 40
There are 7 hits at base# 53
There are 4 hits at base# 39
There are 1 hits at base# 41



ScrFI CCngg 39
1: 41  2: 40  3: 40  3: 41  4: 40  4: 41
5: 40  6: 32  6: 40  7: 32  7: 40  8: 40
9: 32  9: 40  10: 40  11: 40  12: 39  12: 53
13: 40  13: 53  14: 53  16: 40  16: 53  17: 40
17: 53  18: 40  18: 53  19: 40  19: 53  21: 39
22: 39  23: 40  24: 40  26: 40  27: 40  28: 40
29: 15  29: 40  30: 39
There are 21 hits at base# 40
There are 4 hits at base# 39
There are 3 hits at base# 41

MaeIII gtnac 16
1: 52  2: 52  3: 52  4: 52  5: 52  6: 52
7: 52  9: 52  26: 52  27: 10  27: 52  28: 10
28: 52  29: 10  29: 52  30: 52
There are 13 hits at base# 52

Tsp45I gtsac 15
1: 52  2: 52  3: 52  4: 52  5: 52  6: 52
7: 52  9: 52  27: 10  27: 52  28: 10  28: 52
29: 10  29: 52  30: 52
There are 12 hits at base# 52

HphI tcacc 26
1: 53  2: 53  3: 53  4: 53  5: 53  6: 53
7: 53  8: 53  9: 53  10: 53  11: 59  13: 59
14: 59  17: 59  18: 59  19: 59  20: 59  21: 59
22: 59  23: 59  24: 59  25: 59  27: 59  28: 59
30: 59  31: 59
There are 16 hits at base# 59
There are 10 hits at base# 53



BspMI ACCTGCNNNNn 14
11: 61  13: 61  14: 61  17: 61  18: 61  20: 61
21: 61  22: 61  23: 61  24: 61  30: 61  31: 61
There are 14 hits at base# 61 Goes into CDR1
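
The base# values in Table 405 are consistent with the 1-based start of the recognition pattern on the listed strand: in the same genes, gactc is reported at base 12 and NnnnnnGACTC at base 6. A minimal Python sketch of such a scan follows, assuming standard IUPAC ambiguity codes. `find_sites` and `IUPAC` are illustrative names, not the tool used to build the table, and the test sequence is the 2e lambda FR1 row with its OCR errors corrected.

```python
import re
from typing import List

# IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def find_sites(sequence: str, pattern: str) -> List[int]:
    """Return the 1-based start of every (possibly overlapping) pattern match."""
    regex = "".join(IUPAC[base] for base in pattern.upper())
    # A zero-width lookahead lets overlapping sites be reported.
    return [m.start() + 1 for m in re.finditer(f"(?={regex})", sequence.upper())]

# FR1 of lambda germline gene 2e (assumed OCR-corrected reading of the row above).
FR1_2E = ("cagtctgccctgactcagcctcgctcagtgtccggg"
          "tctcctggacagtcagtcaccatctcctgc")

print(find_sites(FR1_2E, "GACTC"))        # PleI  -> [12]
print(find_sites(FR1_2E, "NNNNNNGACTC"))  # MlyI  -> [6]
print(find_sites(FR1_2E, "CCNGG"))        # ScrFI -> [32, 40]
print(find_sites(FR1_2E, "CCWGG"))        # BstNI -> [40]
```

Only the listed strand is scanned here; the table reports the reverse orientation of asymmetric sites (for example gagtc for PleI) as separate entries, which would correspond to a second call with the reverse-complemented pattern.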



Table 500: h3401-h2 captured via CJ with BsmAI

! 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
! S A Q D I Q M T Q S P A T L S
aGT GCA Caa gac atc cag atg acc cag tct cca gcc acc ctg tct
! ApaLI ... a gcc acc ! L25,L6,L20,L2,L16,A11
! Extender Bridge...

! 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
! V S P G E R A T L S C R A S Q
gtg tct cca ggg gaa agg gcc acc ctc tcc tgc agg gcc agt cag

! 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
! S V S N N L A W Y Q Q K P G Q
agt gtt agt aac aac tta gcc tgg tac cag cag aaa cct ggc cag

! 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
! V P R L L I Y G A S T R A T D
gtt ccc agg ctc ctc atc tat ggt gca tcc acc agg gcc act gat

! 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75
! I P A R F S G S G S G T D F T
atc cca gcc agg ttc agt ggc agt ggg tct ggg aca gac ttc act

! 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
! L T I S R L E P E D F A V Y Y
ctc acc atc agc aga ctg gag cct gaa gat ttt gca gtg tat tac

! 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105
! C Q R Y G S S P G W T F G Q G
tgt cag cgg tat ggt agc tca ccg ggg tgg acg ttc ggc caa ggg

! 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
! T K V E I K R T V A A P S V F
acc aag gtg gaa atc aaa cga act gtg gct gca cca tct gtc ttc

! 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135
! I F P P S D E Q L K S G T A S
atc ttc ccg cca tct gat gag cag ttg aaa tct gga act gcc tct

! 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150
! V V C L L N N F Y P R E A K V
gtt gtg tgc ctg ctg aat aac ttc tat ccc aga gag gcc aaa gta






! 151 


152 


153 


154 


155 


156 


157 


158 


159 


160 


161 


162 


163 


164 


165 


! Q 


W 


K 


v 




N 




L 




q 




In 


q 




ill 


caa 


taa 


aag 


atcr 


gat 


aac 


gcc 


etc 


caa 


teg 


ctcxt~ 

yy L 


^> 






gag 


! 166 


167 


168 


169 


170 


171 


172 


173 


174 


175 


176 


177 


178 


179 


180 


! S 


v 


T 






D 


q 


rv 


n 
u 




i 


v 

X 




T 

li 


o 


agt 


gtc 


aca 




cag 


gac 


age 


aag 


gac 


ex y o 


/— 


i - /— • 




r* "t~ c* 


age 


! 181 


182 


183 


184 


185 


186 


187 


188 


189 


190 


191 


192 


193 


194 


195 


! S 


T 


L 


T 


T, 

XJ 


q 




A 


n 
u 


Y 

X 




IT 


TT 

ri 


i\ 


V 


age 


acc 


ct g 


acg 


ct g 


age 


G CLCL 


Cl C P\ 




L_ do 


gag 


aaa 


cac 


aaa 


gtc 


! 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210
! Y A C E V T H Q G L S S P V T
tac gcc tgc gaa gtc acc cat cag ggc ctg agc tcg cct gtc aca

! 211 212 213 214 215 216 217 218 219 220 221 222 223
! K S F N K G E C K G E F A
aag agc ttc aac aaa gga gag tgt aag ggc gaa ttc gc.
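
The amino-acid annotation lines in Table 500 can be cross-checked against the codon lines using the standard genetic code. A minimal Python sketch follows. `translate` and `CODON_TABLE` are illustrative names, and the codon row is the first row of Table 500 as read after correcting OCR errors (for example `ate` read as `atc` and `gec` as `gcc`), which is an assumption about the original text.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, with codons enumerated
# in T, C, A, G order at each of the three positions ('*' marks a stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(codons: str) -> str:
    """Translate whitespace-separated codons into one-letter amino-acid codes."""
    return "".join(CODON_TABLE[c.upper()] for c in codons.split())

row_1_to_15 = "aGT GCA Caa gac atc cag atg acc cag tct cca gcc acc ctg tct"
print(translate(row_1_to_15))  # SAQDIQMTQSPATLS
```

A codon that is not a valid triplet of A, C, G, T raises a `KeyError`, which makes leftover OCR substitutions such as `etc` or `tec` easy to spot.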






Table 501: h3401-d8 KAPPA captured with CJ and BsmAI

! 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
! S A Q D I Q M T Q S P A T L S
aGT GCA Caa gac atc cag atg acc cag tct cct gcc acc ctg tct
! ApaLI ... gcc acc ! L25,L6,L20,L16,A11
! A GCC ACC CTG

! 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
! V S P G E R A T L S C R A S Q
gtg tct cca ggt gaa aga gcc acc ctc tcc tgc agg gcc agt cag
! GTG TCT CCA GGG GAA AGA GCC ACC CTC TCC TGC ! L2

! 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
! N L L S N L A W Y Q Q K P G Q
aat ctt ctc agc aac tta gcc tgg tac cag cag aaa cct ggc

! 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
! A P R L L I Y G A S T G A I G
gct ccc agg ctc ctc atc tat ggt gct tcc acc ggg gcc att ggt

! 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75
! I P A R F S G S G S G T E F T
atc cca gcc agg ttc agt ggc agt ggg tct ggg aca gag ttc act

! 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
! L T I S S L Q S E D F A V Y F
ctc acc atc agc agc ctg cag tct gaa gat ttt gca